INTRODUCTION

The  Gary  Plan 

In  the  last  few  years  both  laymen  and  professional 
educators  have  engaged  in  a  lively  controversy  as  to  the 
merits  and  defects,  advantages  and  disadvantages  of 
what  has  come  to  be  called  the  Gary  idea  or  the  Gary 
plan.  The  rapidly  increasing  literature  bearing  on  the 
subject  is,  however,  deficient  in  details  and  too  often 
partisan  in  tone.  The  present  study  was  undertaken 
by  the  General  Education  Board  at  the  request  of  the 
Gary  school  authorities  for  the  purpose  of  presenting  an 
accurate  and  comprehensive  account  of  the  Gary  schools 
in  their  significant  aspects. 

In  the  several  volumes  in  which  the  main  features  of 
the  Gary  schools  are  separately  considered,  the  reader 
will  observe  that,  after  presenting  facts,  each  of  the 
authors  discusses  or — in  technical  phrase — attempts  to 
evaluate  the  Gary  plan  from  the  angle  of  his  particular 
interest.  Facts  were  gathered  in  a  patient,  painstaking, 
and  objective  fashion;  and  those  who  want  facts,  and 
facts  only,  will,  it  is  believed,  find  them  in  the  descriptive 
and  statistical  portions  of  the  respective  studies.  But 
the  successive  volumes  will  discuss  principles,  as  well  as 
state facts. That is, the authors will not only describe
the Gary schools in the frankest manner, as they found
them, but they will also endeavor to interpret them in the
light of the large educational movement of which they
are part. An educational conception may be sound or
unsound; any particular effort to embody an educational
conception may be adequate or inadequate, effective
or ineffective. The public is interested in knowing
whether the Gary schools as now conducted are efficient
or  inefficient;  the  public  is  also  interested  in  knowing 
whether  the  plan  as  such  is  sound  or  unsound.  The 
present  study  tries  to  do  justice  to  both  points. 

What  is  the  Gary  plan? 

Perhaps, in the first instance, the essential features of
the Gary plan can be made clear if, instead of trying to
tell what the Gary plan is, we tell what it is not. Except
for its recent origin and the unusual situation as
respects its foreign population, Gary resembles many
other industrial centers that are to be found throughout
the country. Now, had Gary provided itself with the
type of school commonly found in other small industrial
American towns, we should find there half a dozen or
more square brick "soap-box" buildings, each accommodating
a dozen classes pursuing the usual book studies,
a playground with little or no equipment, perhaps a
basement room for manual training, a laboratory, and a
cooking room for the girls. Had Gary played safe, this
is the sort of school and school equipment that it would
now possess. Provided with this conventional school
system, the town would have led a conventional school
life: quiet, unoffending, and negatively happy, doing
as many others do, doing it about as well as they do it,
and satisfied to do just that.

As  contrasted  with  education  of  this  meager  type,  the 
Gary  plan  is  distinguished  by  two  features,  intimately 
connected  with  each  other: 

First, the enrichment and diversification of the curriculum;

Second, the administrative device that, for want of a
better name, will be tentatively termed the duplicate
school organization.

These two features must first be considered in general
terms, if the reader is to understand the detailed description
and discussion.

As  to  the  curriculum  and  school  activities.  While 
the  practice  of  education  has  in  large  part  continued 
to  follow  traditional  paths,  the  progressive  literature  of 
the  subject  has  abounded  in  constructive  suggestions 
of  far-reaching  practical  significance.  Social,  political, 
and  industrial  changes  have  forced  upon  the  school 
responsibilities  formerly  laid  upon  the  home.  Once  the 
school  had  mainly  to  teach  the  elements  of  knowledge; 
now  the  school  is  charged  with  the  physical,  mental,  and 
social  training  of  the  child.  To  meet  these  needs  a 
changed  and  enriched  curriculum,  including  community 
activities, facilities for recreation, shop work, and household
arts, has been urged on the content side of school
work; the transformation of school aims and discipline
on the basis of modern psychology, ethics, and social
philosophy  has  been  for  similar  reasons  recommended  on 
the  side  of  attitude  and  method. 

These  things  have  been  in  the  air.  Every  one  of  them 
has  been  tried  and  is  being  practised  in  some  form  or 
other,  somewhere  or  other.  In  probably  every  large 
city  in  the  country  efforts  have  been  made,  especially  in 
the  more  recent  school  plants,  to  develop  some  of  the 
features  above  mentioned.  There  has  been  a  distinct, 
unmistakable,  and  general  trend  toward  making  the 
school  a  place  where  children  "live"  as  well  as  "learn." 
This  movement  did  not  originate  at  Gary;  nor  is  Gary  its 
only evidence. It is none the less true that perhaps nowhere
else have the schools so deliberately and explicitly
avowed this modern policy. The Gary schools are officially
described as "work, study, and play" schools:
schools, that is, that try to respond adequately to a many-sided
responsibility; how far and with what success, the
successive  reports  of  the  Gary  survey  will  show. 

It  must  not,  however,  be  supposed  that  the  enriched 
curriculum was applied in its present form at the outset
or that it is equally well developed in all the Gary
schools.  Far  from  it.  There  has  been  a  distinct  and 
uneven  process  of  development  at  Gary;  sometimes,  as 
subsequent  chapters  will  show,  such  rapid  and  unstable 
development  that  our  account  may  in  certain  respects 
be  obsolete  before  it  is  printed.  When  the  Emerson 
school  was  opened  in  1909,  the  equipment  in  laboratories, 
shops,  and  museums,  while  doubtless  superior  to  what 
was offered by other towns of the Gary type, could have
been matched by what was to be found in many of the
better favored larger towns and cities at the same period.
The gymnasium, for example, was not more than one
third its present size; the industrial work was not unprecedented
in kind or extent; the boys had woodwork,
the girls cooking and sewing. But progress was rapid:
painting and printing were added in 1911; the foundry,
forge, and machine shop in 1913. The opportunities
for girls were enlarged by the addition of the cafeteria in
1913. The auditorium reached its present extended use
as recently as the school year 1913-14. The Froebel
school, first occupied in the fall of 1912, started with
facilities  similar  to  those  previously  introduced  piecemeal 
into  the  Emerson. 

These  facilities,  covering  in  their  development  a  period 
of  years,  represent  the  effort  to  create  an  elementary 
school  more  nearly  adequate  to  the  needs  of  modern 
urban life. The curriculum is enriched by various activities
in the fields of industry, science, and recreation.
Questions as to the efficiency with which these varied
activities have been administered will be discussed by
the various contributors to the present study. Meanwhile,
it is perhaps only fair to point out that the modern
movement calls not only for additions to, but eliminations
from, the curriculum and for a critical attitude
toward the products of classroom teaching. How far, on
the academic side, the Gary schools reflect this aspect
of the modern movement will also presently appear.

The administrative device, the "duplicate" organization
noted above as the second characteristic feature of
the Gary plan, stands on a somewhat different footing,
as  the  following  considerations  make  plain. 

Once more, Mr. Wirt was not the inventor of the intensive
use of school buildings, though he was among the
first, if not the very first, to perceive the purely educational
advantage to which the situation could be turned.
The rapidity with which American cities have grown has
created a difficult problem for school administrators:
the problem of providing space and instruction for children
who increase in number faster than buildings are
constructed. The problem has been handled in various
ways. In one place, the regular school day has been
shortened and two different sets of children attending at
different hours have been taught daily in one building
and by one group of teachers. Elsewhere, as in certain
high schools, a complete double session has been conducted.
The use of one set of schoolrooms for more than
one  set  of  children  each  day  did  not  therefore  originate 
at  Gary. 

Another  point  needs  to  be  considered  before  we  discuss 
the  so-called  duplicate  feature  of  the  Gary  plan.  In 
American  colleges,  subjects  have  commonly  been  taught 
by specialists, not by class teachers. The work is "departmentalized,"
to use the technical term. There is
a teacher of Latin, a teacher of mathematics, a teacher
of physics, who together instruct every class, rather than a
separate teacher of each class in all subjects. Latterly,
departmentalization has spread from the college into
the high school, until nowadays well organized high
schools and the upper grades of elementary schools are
quite generally "departmentalized," i.e., organized with
special  teachers  for  the  several  subjects,  rather  than 
with  one  teacher  for  each  grade. 

Out of these two elements, Gary has evolved an administrative
device, the so-called duplicate school, which,
from the standpoint of its present educational significance,
does indeed represent a definite innovation.

For  the  sake  of  clearness,  it  will  be  well  to  explain  the 
theory  of  the  duplicate  school  by  a  simplified  imaginary 
example: 

Let  us  suppose  that  elementary  school  facilities  have 
to  be  provided  for,  say,  1,600  children.  If  each  class  is 
to  contain  a  maximum  of  40  children,  a  schoolhouse  of 
40  rooms  would  formerly  have  been  built,  with  perhaps 
a  few  additional  rooms,  little  used,  for  special  activities; 
except  during  the  recess  (12  to  1:30)  each  recitation 
room would be in practically continuous use in the old-line
subjects from 9 to 3:30, when school is adjourned till
next  morning.  A  school  plant  of  this  kind  may  be 
represented  by  Figure  I,  each  square  representing  a 
schoolroom. 

FIGURE I
REPRESENTS OLD-FASHIONED SCHOOLHOUSE

40 rooms for 40 classes, of 40 children each, i.e., facilities for the academic instruction
of 1,600 children. A school yard and an extra room or two, little used, for special
activities, are also usually found.

The "duplicate" school proposes a different solution.
Instead of providing 40 classrooms for 40 classes, it
requires 20 classrooms, capable of holding 800 children;
and further, playgrounds, laboratories, shops, gardens,
gymnasium, and auditorium, also capable of holding
800 children. If, now, 800 children use the classrooms
while 800 are using the other facilities, morning and afternoon,
the entire plant accommodates 1,600 pupils
throughout the school day; and the curriculum is greatly
enriched, since, without taking away anything from their
classroom work, they are getting other branches also. A
school thus equipped and organized may be represented
by Figure II, in which A represents 20 classes taking
care of 40 children each (800 children), and B represents
special facilities taking care of 800 children. As A
and B are in simultaneous operation, 1,600 children are
cared for.
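
The arithmetic of this simplified example can be checked with a short sketch. The Python below is an illustration only and is not part of the survey. The function names are hypothetical, and the code assumes, as the imaginary example does, that the pupils divide into two equal groups, one in the classrooms while the other uses the special facilities.

```python
import math


def conventional_plant(pupils: int, class_size: int) -> int:
    """Classrooms needed when every class keeps its own room all day."""
    return math.ceil(pupils / class_size)


def duplicate_plant(pupils: int, class_size: int) -> dict:
    """Classrooms and special-facility capacity under the duplicate plan.

    Assumption (hypothetical helper): pupils form two equal groups, one in
    the classrooms while the other uses shops, playground, auditorium, etc.
    """
    group = math.ceil(pupils / 2)  # children in each half of the school
    return {
        "classrooms": math.ceil(group / class_size),
        "special_facility_capacity": group,
        "total_capacity": 2 * group,
    }


old = conventional_plant(1600, 40)
new = duplicate_plant(1600, 40)
print(old)  # classrooms under the old plan
print(new)  # classrooms and special capacity under the duplicate plan
```

For the figures used in the text this gives 40 rooms under the old plan, and 20 classrooms together with special facilities for 800 children under the duplicate plan. The classrooms dropped from the old plan reappear as special facilities of equal capacity.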

This method of visualizing the "duplicate" school
serves to correct a common misconception. The plan
aims to intensify the use of schoolrooms; yet it would be
incorrect to say that 20 classrooms, instead of 40,
as under the old plan, accommodate 1,600 children.
For while the number of classrooms has been reduced
from 40 to 20, special facilities of equal capacity have
been added in the form of auditorium, shops, playground,
etc. The 20 classrooms apparently saved
have been replaced by special facilities of one kind or
another. The so-called duplicate organization and
the longer school day make it possible to give larger
facilities to twice as many children as the classrooms alone
would accommodate. The duplicate school, as developed
at Gary, is not therefore a device to relieve congestion
or to reduce expense, but the natural result of
efforts to provide a richer school life for all children.

FIGURE II
REPRESENTS THE GARY EQUIPMENT

A. 20 classrooms for academic instruction of 20 classes of 40 children each (800 children)
in the morning hours and an equal number in the afternoon (1,600 in all daily).

B. Special facilities, taking care of 800 children in the morning hours and an equal
number in the afternoon hours (1,600 in all daily): auditorium; shops; laboratories;
playground, gardens, gymnasium, and library.

The enriched curriculum and the duplicate organization
support each other. The social situation requires
a scheme of education fairly adequate to the
entire  scope  of  the  child's  activities  and  possibilities; 
this  cannot  be  achieved  without  a  longer  school  day  and 
a  more  varied  school  equipment.  The  duplicate  school 
endeavors  to  give  the  longer  day,  the  richer  curriculum, 
and  the  more  varied  activities  with  the  lowest  possible 
investment  in,  and  the  most  intensive  use  of,  the  school 
plant.  The  so-called  duplicate  school  is  thus  a  single 
school  with  two  different  types  of  facilities  in  more  or  less 
constant  and  simultaneous  operation,  morning  and 
afternoon. 

Such  is  the  Gary  plan  in  conception.  What  about  the 
execution?  Is  it  realized  at  Gary?  Does  it  work? 
What  is  involved  as  respects  space,  investment,  etc., 
when ordinary classrooms are replaced by shops, playgrounds,
and laboratories? Can a given equipment in
the way of auditorium, shops, etc., handle precisely
the same number of children accommodated in the classrooms
without doing violence to their educational needs
on  the  one  hand,  and  without  waste  through  temporary 
disuse  of  the  special  facilities,  on  the  other?  To  what 
extent  has  Gary  modified  or  reorganized  on  modern  lines 
the  treatment  of  the  common  classroom  subjects?  How 
efficient  is  instruction  in  the  usual  academic  studies  as 
well  as  in  the  newer  or  so-called  modern  subjects  and 
activities?  Is  the  plan  economical  in  the  sense  that 
equal  educational  advantages  cannot  be  procured  by 
any other scheme except at greater cost? These and
other  questions  as  to  the  execution  of  the  Gary  plan  are, 
as  far  as  data  were  obtainable,  discussed  in  the  separate 
volumes  making  up  the  present  survey. 

The concrete questions above mentioned do not, however,
exhaust the educational values of a given school
situation. From every school system there come imponderable
products, bad as well as good. Aside from
all else, many observers of the Gary schools report one
such imponderable in the form of a spiritual something
which can hardly be included in a study of administration
and eludes the testing of classroom work. These
observers  have  no  way  of  knowing  whether  Gary  school 
costs  are  high  or  low;  whether  the  pupils  spell  and  add  as 
well  as  children  do  elsewhere;  but,  however  these  things 
may  be,  they  usually  describe  the  pupils  as  characterized 
by  self-possession,  resourcefulness,  and  happiness  to  an 
unusual  degree.  While  different  schools  and  indeed 
different  parts  of  the  same  school  vary  in  this  respect, 
the  members  of  the  survey  staff  agree  that,  on  the  whole, 
there  is  a  basis  of  fact  for  these  observations.  Gary  is 
thus something more than a school organization characterized
by the two main features above discussed.

The reason is not far to seek. Innovation is stimulating,
just as conformity is deadening. Experiment
is  in  this  sense  a  thing  wholesome  in  itself.  Of  course 
it  must  be  held  to  strict  accountability  for  results;  and 
this  study  is  the  work  of  persons  who,  convinced  of  the 
necessity  of  educational  progress,  are  at  the  same  time 
solicitous that the outcome be carefully observed.
The fact that customary school procedure does not rest
upon a scientific basis, does not willingly submit itself
to thorough scrutiny, is no reason for exempting educational
innovations from strict accountability. The very
reverse is indeed true; for otherwise innovation may imperil
or sacrifice essential educational values, without
actually knowing whether or not it has achieved definite
values of its own. Faith in a new program does not
absolve the reformer from a watchful and critical attitude
toward results. Moreover, if the innovator formulates
his purposes in definite terms and measures his
results in the light of his professed aims, the conservative
cannot permanently escape the same process. Gary, like
all other educational experiments, must be held accountable
in this fashion. Subject however to such accountability,
the breaking of the conventional school
framework,  the  introduction  of  new  subject  matter  or 
equipment,  even  administrative  reorganization,  at  Gary  as 
elsewhere,  tend  to  favor  a  fresher,  more  vigorous  interest 
and  spirit.  Defects  will  in  the  following  pages  be  pointed 
out in the Gary schools: defects of organization, of administration,
of instruction. But there is for the reasons
just  suggested  something  in  the  Gary  schools  over  and 
above  the  Gary  plan.  Problems  abound,  as  in  every 
living  and  developing  situation.  But  the  problems 
are  the  problems  of  life,  and,  as  such,  are  in  the  long 
run  perhaps  more  hopeful  than  the  relatively  smooth 
functioning of a stationary school system. Thus, notwithstanding
the defects and shortcomings which this
study  will  candidly  point  out,  the  experiment  at  Gary 
rightly  observed  and  interpreted  is  both  interesting  and 
stimulating. 


AUTHOR'S  PREFACE 

The  present  account  of  the  measurement  of  certain 
products  of  classroom  teaching  at  Gary  presents  in 
detail the results of the various tests given for the purpose
of measuring the degree of efficiency with which the
common school subjects were taught at the time of the
investigation of the Gary schools. Because of the care
with which the testing work at Gary was conducted,
and the number and variety of the tests given, the data
secured seem to the author to have thrown much light
upon some of the problems fundamental to all measurement
work. He believes that the testing movement
has now reached a stage in which a critical
study of the validity of the results secured may be both interesting
and beneficial. Accordingly, he has ventured
to discuss at considerable length in Section 2 of each
chapter the technique of classroom testing and the various
factors which affect the results. The report ought,
therefore, to be of interest not only because it deals with
Gary, but because it attempts a general critical discussion
of tests and testing.

A  volume  of  this  character  is  never  the  product  of  a 
single  mind.  The  writer  would  be  ungrateful  indeed 
did  he  not  acknowledge  his  indebtedness  for  the  great 
and varied assistance he has received: first and foremost,
to the superintendent, teachers, and pupils of the
Gary  schools,  without  whose  fullest  cooperation  the 
thoroughness  of  the  testing  work  could  not  have  been 
attained;  to  Messrs.  Ayres,  Thorndike,  and  Judd,  for 
aid  in  the  interpretation  of  the  data  secured;  and  finally, 
and  perhaps  most  of  all,  to  the  enthusiasm,  disinterested 
service  and  fidelity  of  the  six  young  men  who  served  as 
his  assistants,  Messrs.  Paul  C.  Packer,  E.  J.  Ashbaugh, 
George  C.  Brandenburg,  Leo  J.  Brueckner,  J.  W. 
Richardson, and E. H. Lauer. It is through their arduous
labors, intelligent cooperation, and conscientious performance
of assigned tasks that the original design of
the  survey  has  been  carried  out  as  planned. 

For  such  errors  in  planning,  execution,  and  expression, 
as  may  be  found,  the  writer  accepts  full  responsibility. 
They  indicate  merely  his  limitations  and  his  inability 
to  profit  fully  by  the  generous  and  loyal  assistance  which 
all  have  been  glad  to  give. 


MEASUREMENT  OF  CLASSROOM  PRODUCTS 


I.    GENERAL  STATEMENT 

Status  of  Educational  Measurement 

Throughout this report the reader will need
to keep constantly in mind the fact that educational
measurement is a recent development.
The first reliable scale for measurement of any educational
product was published early in 1910; measurement was
first used in a large modern survey in 1912. Even
to-day  there  are  probably  hundreds  of  educational 
workers  who  have  never  heard  of  measurement,  and 
hundreds  more  who  have  not  the  faintest  conception 
of  the  fundamental  principles  involved. 

On  the  other  hand,  so  rapid  has  been  the  development 
that  a  bureau  for  educational  research  was  established 
in  a  city  school  system  by  September,  1913.  To-day 
there is a National Association of Directors of Educational
Research with a membership of thirty-odd men
and  women  who  are  giving  their  time  wholly,  or  mainly, 
to  such  work.  Courses  in  educational  measurement  are 
given  in  most  university  schools  of  education  and  in 
many  normal  schools.  No  survey  of  a  school  system 
would  to-day  be  attempted  without  making  provision  to 
secure objective evidence through educational measurement
upon which to base conclusions.

However, the mushroom growth of the movement for
measurement, the superficial use of untested tools, the
hasty generalization from insufficient data do not make
for confidence. There is at the present moment a very
great danger that educational measurement will be discredited
more by the reckless optimism of its friends
than  by  the  attacks  of  its  enemies.  Yet  of  the  value  of 
measurement  itself  there  can  be  no  question;  the  methods 
of  science  are  not  on  trial.  There  is  no  question,  even, 
of  the  applicability  of  methods  of  scientific  measurement 
to  educational  problems.  The  one  thing  that  is  needed 
is time, time to study the measuring instruments themselves
before accepting them as perfect, time to formulate
fully  a  problem  before  attempting  to  solve  it,  time  to 
gather  reliable  data  and  to  digest  them  before  arriving 
at  conclusions. 

LIMITATIONS  OF  MEASUREMENT 

In the appraisal of educational innovations the ultimate
question must ever be: "What is the effect upon
the children?" However widely a school system may
depart from established usage, however much its experimental
modifications of either theory or practice
may seem injudicious and undesirable, if it could be
shown by impersonal, objective measurement that because
of the changes that had been made, the graduates
of  the  school  system  in  question  are  better  developed, 
better  trained,  and  generally  more  desirable  members 
of  society,  all  other  judgments  would  have  to  be  reversed. 

Unfortunately, it is not possible at the present time
to measure completely and objectively the total educational
product of a school system. For many years to
come, expert opinion based upon inspection must be our
only means of estimating the general worth of such educational
experiments as that at Gary. But in certain
phases  of  school  work  measurement  is  possible,  and  such 
measurement  serves  as  a  check  upon  the  observations 
and  opinions  of  the  inspecting  experts.  If,  for  example, 
inspection  is  made  of  the  teaching  of  spelling  and  such 
teaching  is  judged  to  be  faulty,  while  the  objective  tests 
of  spelling  ability  prove  that  the  children  spell  very 
much  better  than  children  of  the  same  grade  or  age  in 
conventional  school  systems,  then  the  ability  of  the 
judges  to  pass  upon  the  innovations  may  well  be  doubted. 
If,  on  the  other  hand,  the  results  of  the  objective  tests  and 
the  subjective  opinion  of  the  inspectors  are  in  complete 
agreement,  then  the  judgments  of  the  experts  in  those 
matters  in  which  no  such  checks  are  at  present  possible 
may  be  accepted  with  greater  confidence.  Educational 
measurement  in  its  present  stage  of  development  must 
necessarily  deal  with  but  few  of  the  many  products  of 
educational  work.  It  is  more  valuable  as  a  check 
upon  the  more  extended  (but  less  reliable)  subjective 
judgments of a survey staff than as a complete and independent
determination of the merits or demerits of a
school system. Therefore, this report should be read and
interpreted in connection with that on Organization and
Administration, and the chapters, in The Gary Schools:
A General Account, dealing with the Course of Study
and  Instruction. 

TITLE 

The  title  chosen  for  this  report  calls  for  explanation. 
Measurement  itself  is  so  new,  the  total  of  our  scientific 
knowledge  of  tests  and  testing  so  meager,  that  in  spite 
of the limitations implied in the use of the word "certain,"
the title "Measurement of Certain Products of Classroom
Teaching" is still too sweeping in its claims. Any
ordinary writing lesson may awaken the dormant self-consciousness
of a child to a sense of his own power;
the mastery of a spelling difficulty may strengthen the
fibers of his character; the deadliest grind found in the
most mechanical school may, for certain individuals,
serve to organize and direct their energies toward worthy
aims. Stimulation, character building, and inspiration
may be products of classroom teaching of the common
branches no less than the grosser elements really measured
by our educational tests. When, therefore, we
undertake  to  measure  classroom  products  and  measure 
only  efficiency  in  certain  mechanical  abilities  we  may  miss 
wholly  certain  other  products  of  equal  or  greater  value. 
If  all  the  ideas,  which  a  strict  regard  for  the  truth  would 
require, were to appear in the title, a statement something
like the following would have to be used: "A
report  of  an  attempt  to  measure  a  few  phases  of  certain 
products  of  classroom  teaching  in  four  of  the  largest 
public  schools  at  Gary."  The  shorter  title  should  be  so 
read  as  to  connote  the  longer. 

Yet, after all, it may be that the shorter title does not
so  culpably  misrepresent  the  truth.  True  stimulation 
awakens  a  child  to  his  school  opportunities  no  less  than 
to  those  of  the  outside  world.  True  character  building 
is  made  manifest  by  the  spirit  in  which  daily  tasks  are 
performed,  whether  these  tasks  are  in  the  school  or  in 
the  home.  True  inspiration  produces  as  substantial 
achievement in childhood as in the prime of life. Therefore,
when measurement of the higher phases of the products
of classroom teaching becomes possible, it may be
that  we  shall  find  the  higher  and  the  lower  so  indissolubly 
linked  together  that  from  the  measurement  of  one,  the 
degree  of  development  of  the  other  may  be  inferred. 
The  present  study  is,  at  least,  an  honest  attempt  to 
evaluate  completely  and  thoroughly  those  elements  in 
the  situation  which  are  now  measurable. 

RELIABILITY 

Because  the  dangers  and  limitations  of  measurement 
were  fully  recognized,  the  attempt  was  made  at  Gary  to 
secure  results  as  reliable  as  it  is  possible  to  make  them 
at  present.  Each  subject  was  tested  in  more  than  one 
way,  and  great  care  was  taken  to  control  the  conditions 
under  which  the  tests  were  given  and  scored.  Tests 
of  the  product  of  teaching  of  the  elementary  schools  were 
carried  through  the  high  school  grades  as  well,  both  to 
determine  how  the  abilities  developed  in  the  lower 
grades  were  affected  by  high  school  work,  and  to  see 
whether  or  not  any  marked  changes  in  product  had 
occurred in recent years. From the point of view of
thoroughness,  therefore,  and  within  their  narrow  field, 
the  results  are  probably  more  complete  and  hence  more 
reliable  than  those  of  previous  surveys. 

PLAN  OF  REPORT 

No  effort,  time,  or  expense  has  been  spared  on  this 
report.  And  yet,  as  the  time  comes  for  publication, 
there  comes  also  the  realization  that  what  has  been 
written  may  very  easily  be  misunderstood.  A  survey 
report  to  be  readable  must  condense  and  summarize 
its  findings,  but  to  be  intelligible  must  give  in  full  the 
data  upon  which  its  conclusions  are  based.  Either 
course alone is sure to lead to misconception. Accordingly,
the writer has written this report in three sections.
Section 1 of each chapter is a concise, non-technical discussion
of the significant aspects of the data secured in the
present  attempt  to  evaluate  by  measurement  the  effects 
of  the  Gary  system  upon  the  teaching  of  the  fundamental 
branches  in  the  Gary  schools.  Section  2  contains  critical 
discussions  of  all  that  is  involved  in  the  testing  process — 
analyses  that  will  show  clearly  the  reservations  with 
which  any  set  of  conclusions  should  be  put  forward. 
Finally,  in  the  appendices  has  been  placed  material  of 
value to the student of education and essential to a careful
study of the report, but devoid of interest except to the
specialist. This material includes certain of the longer detailed
tables, directions for scoring the tests, samples of record
sheets, score cards, and other items of a similar nature.

INTERPRETATION


In  interpreting  the  meaning  of  results  of  tests,  three 
methods  of  procedure  are  possible.  One  method — and 
the  best,  for  it  is  the  least  ambiguous — is  to  state  what 
was  found,  leaving  it  to  the  reader  to  judge  for  himself 
whether  or  not  the  product  is  satisfactory.  The  second 
method  is  to  compare  the  results  obtained  with  those 
from  other  cities  where  similar  measurements  have  been 
made.  This  method  is  legitimate  only  when  the  two 
cities  have  been  measured  with  equal  care  and  under 
similar  conditions.  The  third  method  is  to  record  the 
investigator's  own  conclusions.  This  method  is  the 
most  dangerous,  for  it  is  difficult  to  free  opinion  from  the 
effects  of  personal  bias.  However,  in  this  report  all 
three  methods  have  been  followed.  In  succeeding 
chapters  the  direct  and  comparative  data  will  be  found 
in detail. In the final chapter the author's own conclusions
from these data are presented.


II.    TESTS  AND  TESTING  CONDITIONS 

The theory of the modern program at Gary, aiming
to minister adequately to every need of the
child (physical, intellectual, moral, industrial,
and social) and to correlate closely school activities with
those of real life, meets with general approval; but the
vital question to be answered by the measurements described
in this volume is: What effects do the actual ways
in  which  the  Gary  schools  carry  out  their  program  have 
upon  certain  educational  products? 

THE  SCHOOLS  TESTED 

The  four  most  representative  schools  of  Gary,  named 
in  the  order  of  size  and  equipment,  are:  Froebel, 
Emerson,  Jefferson,  and  Beveridge.  The  Froebel  and 
Emerson  schools,  built  within  the  last  ten  years,  are 
both  architecturally  adapted  and  fully  equipped  to  carry 
out  an  extended  and  enriched  program.  The  Jefferson 
and  Beveridge  schools,  however,  were  in  existence  prior 
to  the  development  of  the  present  system.  In  both,  in 
spite of alterations, the attempt to carry out the full implication
of the Gary plan is handicapped by limitations
of equipment. Nevertheless, both have the longer school
day and the new type of organization, and in both the
regular  academic  work  is  supplemented  by  such  additional 
special  work  as  their  facilities  permit.  Comparisons  from 
one  Gary  school  to  another  are  thus  possible  and  should 
serve  to  bring  to  light  any  effects  caused  by  differences 
in  equipment  or  organization. 

The  remaining  schools  are  small  in  size,  limited  in 
equipment,  and  at  such  distances  from  the  center  of  the 
town  that  it  seemed  better  to  confine  the  testing  to  the 
four  schools  mentioned.  Accordingly,  but  few  tests  were 
given  in  the  small  schools. 

SOCIAL  CONDITIONS 

The  sections  of  the  city  served  by  the  various  schools 
differ  greatly  in  social  conditions;  hence,  in  making 
school  to  school  comparisons  these  differences  should 
be  kept  in  mind.  The  Froebel  children  are  mainly 
of  foreign  parentage  and  often  come  from  homes  far 
down  on  the  social  scale.  As  a  result,  difficulty  with 
language  is  a  factor  which  enters  largely  into  all  work 
at  this  center.  The  Emerson  and  Jefferson  schools, 
however,  draw  from  the  better  residential  districts. 
Beveridge  is  in  an  older  and  poorer  section  of  the  city, 
and  its  children  differ  accordingly.  The  differences  in 
the  home  conditions  from  school  to  school  are,  therefore, 
marked. 

In  regard  to  the  related  question  as  to  the  composition 
of  the  population,  it  is  probable  that  the  foreign  born 
element  in  Gary  is  greater  than  in  most  American  cities 
of the same size, but probably not greater than in other
rapidly  growing  industrial  centers.1 

SYSTEM  OF  GRADING 

The regular school year in Gary is ten months, usually
divided into three equal terms. There is a corresponding
division of grades into A, B, and C groups. Thus
the lowest group in the sixth grade is known as 6C,
the next higher as 6B, and the highest group as 6A.
However, all three divisions are often found in the same
class organization, taking the same work.2 But class
organization is not as stable at Gary as in conventional
systems.3 Children tend to come and go, and the membership
of a given class fluctuates correspondingly. In
conducting  a  test  it  has  happened  that  the  grade  label 
given  by  the  principal  to  a  class  did  not  agree  with  that 
given  by  the  teacher,  and  sometimes  neither  was  the 
same  as  the  grades  written  by  the  children  on  the  test 
papers.  Under  the  circumstances,  it  was  decided  to 
give to each class the official grade assigned to it at the

2It might have happened that this plan of promotion would make
comparisons with corresponding grades in other cities unfair, in that a
third of the children would have been in a grade but a short time by the
end of the year. Careful checking of the data, however, proves that
owing to the manner of grouping children in classes, this is not the case;
the special condition is favorable to Gary rather than otherwise.

3The various recitation groups or classes are numbered, as class 18,
Jefferson. Two classes of the same grade in the same school have different
numbers. Thus class 45, Froebel, is an eighth grade class, and
class 46 is also an eighth grade class.

close of school before promotions in June, 1916. Class
No. 45 Froebel, for instance, will be called an eighth
grade  class,  and  throughout  the  tables  that  follow  its 
scores  form  a  part  of  the  eighth  grade  group. 

ATTENDANCE 

Class  No.  45  was  a  persistent  unit  throughout  the 
testing period and has been so tabulated, but the individuals
comprising class No. 45 varied more or less with
every test. The actual number of individuals found
in this class on the 20 different occasions on which it was
visited from March 23 to June 9 varied from 11 to 23
(Figure 1). While this particular class illustrates extreme
variation above and below the official class membership,
similar data were obtained for many other
classes.1 

The  irregularity  in  class  No.  45  Froebel  was  due  to 
many  causes;  some  of  the  variations  were  brought  about 
by  such  legitimate  factors  as  sickness  and  withdrawal 
from school work; other cases represent legitimate variations
caused by adjustment of programs to individual
needs  leading  to  attendance  on  other  than  official  classes. 
For  instance,  individual  "I"  (Figure  1)  on  April  11 
recited  for  some  reason  with  class  No.  46  instead  of 
with class No. 45 in which he officially belonged.

1The results for the other 4 eighth grade classes (combined) are:
official membership 128, maximum attendance 134, minimum 91, median
119; names not on official list, 11; children tested twenty times in the
twenty test days, 34 per cent. See also Table I, Appendix A.

Figure 1

Individuals are denoted by letters in the column at the left. The
dates of the various tests are given at the top. Each small rectangle
represents the attendance of one individual for one day. Black means
absence. Abbreviations have the following meanings:

T means tested in the given class.

N. T. means not tested in the given class.

46 means class No. 46, also eighth grade class.

44 means class No. 44, a seventh grade class.

43 means class No. 43, a seventh grade class.

Withdrawn means left school.

Enrolled or not enrolled refers to the "Official lists."

The official enrollment for class No. 45 was 16, the median attendance
15, the maximum attendance 33, the minimum 11, the number of
different individuals found in the class during the twenty test days, 36.
The superintendent states that this class was a small, irregular group
which served as a temporary "catch all" for the grade. However, similar
variations were found in other classes and grades, also.

Similarly, individual "Z" was regularly tested in class No. 43
(7th  grade),  his  official  class,  but  twice  appeared  in  class 
No.  45  (8th  grade).  Individual  "B"  appears  on  the 
official  list1  of  class  No.  46  and  individual  "S"  in  class 
No.  44,  but  neither  is  found  in  the  records  of  the  tests 
of  these  classes.  Individuals  "C,"  "J,"  and  "L" 
represent  still  another  type  of  variation;  their  names 
do  not  appear  on  any  of  the  official  lists  of  any  classes 
yet  their  test  papers  are  on  file  as  proof  that  on  the  days 
mentioned  they  were  present  in  class  No.  45.  In  grades 
four  to  eight,  for  51  classes  with  an  average  membership 
of  31,  there  were  found  present  for  the  spelling  test  on 
May  2  and  3,  on  the  average,  1.7  names  per  class  which 
did  not  appear  on  the  official  enrollment.2  That  is, 
the  tests  were  given  to  continually  fluctuating  groups. 

No  complete  tabulation  of  the  exact  attendance  by 
individuals  in  each  test  and  class  was  made,  because  a 

1Principals were asked to furnish complete lists by classes of all the
children  enrolled.  Previously,  children's  names  appearing  on  the 
teachers'  registers  had  been  copied  on  cards,  and  checked  against  the 
promotion  lists  for  grades  and  against  the  census  reports  for  age.  When 
the  lists  furnished  by  the  principals  had  been  checked  against  these 
cards,  they  were  adopted  as  "Official  Class  Lists"  and  are  so  referred 
to  throughout  this  report. 

2Percentage of "extra" pupils: Froebel, 7.5%; Jefferson, 6.5%;
Emerson,  2%;  Beveridge,  1%.  There  is  some  evidence  tending  to  show 
that  the  "extra"  pupils  were  present  in  larger  numbers  when  the  testing 
work  was  new.  Whether  the  absent  and  extra  groups  represent  real 
conditions  or  simply  defects  in  the  "official  lists"  cannot  be  determined. 
The  "official  lists"  represent,  at  least,  the  best  that  could  be  done  with 
such  lists  as  the  principals  furnished. 
study of attendance, as such, does not fall within the
scope of this report; but plenty of incidental evidence
that Figure 1 actually reflects conditions throughout all
grades of the school during the testing periods has been
accumulated during the process of checking records.
For instance, the actual number of children tested on
the 19 test days varied from 68 per cent. to 90 per cent.
(average 83 per cent.) (Table I, Figure 2). The writer
estimates that about 80 children out of 100 enrolled
were tested regularly,1 and that the remaining children varied
from day to day. As part of either the constant or
fluctuating group, there were approximately 5 per cent.
of the children who were either not enrolled at all, or who,
from causes legitimate or otherwise, recited from day
to day with classes other than their own. On the average,
therefore, a single test measures only 83 per cent.
of the total number of children enrolled.

COURSES  OF  STUDY,   TIME  ALLOTMENTS 

An  important  series  of  facts  bearing  directly  upon  the 
interpretation  of  the  results  of  tests  are  those  connected 
with  the  amount  and  character  of  the  instruction,  with 

1That is, three days out of four. For instance, the four Trabue Language
Scales were given at Gary as follows: B, April 13; D, April 14;
E, May 16; C, May 29. A class selected at random from the Emerson
school proved to be class No. 12, 6th grade. From the tests, 43 names
were secured (official membership, 38); 49 per cent. of these children
were present for all four tests, 26 per cent. for three tests, 11 per cent. for
two tests, and 14 per cent. for one test only. Approximately 55 per cent.
of the official membership were present for all four tests.
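
The attendance figures quoted in the note above for class No. 12 can be checked with a little arithmetic. The sketch below is only an illustration, not part of the survey's own procedure: the function and variable names are hypothetical, the share for two tests is read as 11 per cent. (the value that makes the four shares total 100), and head counts are recovered by rounding, which is an assumption.

```python
def attendance_breakdown(names_found, official_membership, share_by_tests_attended):
    """Turn percentage shares of the names found into approximate head counts.

    share_by_tests_attended maps "number of tests attended" to a percentage
    of the names found. Rounding to whole children is an assumption.
    """
    counts = {
        tests: round(names_found * pct / 100)
        for tests, pct in share_by_tests_attended.items()
    }
    # Children present for all four tests, as a percentage of the official list.
    all_four_vs_official = 100 * counts[4] / official_membership
    return counts, all_four_vs_official


# Figures quoted for class No. 12, Emerson (6th grade).
shares = {4: 49, 3: 26, 2: 11, 1: 14}
counts, all_four_pct = attendance_breakdown(43, 38, shares)

print(counts)
print(round(all_four_pct))
```

Under these assumptions the rounded head counts add back up to the 43 names found, and the children present for all four tests come to roughly 55 per cent. of the official membership of 38, which agrees with the figure stated in the note.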
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Figure  2 
Percentage  op  Attendance  at  Time  of  Tests  from  March  to  June 
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The  scale  along  the  base  of  the  figure  is  the  time  scale.  The  scale 
along  the  vertical  axis  represents  the  per  cent  the  actual  attendance  is 
of  the  official  enrollment.  The  solid  line  represents  the  results  obtained. 
The  dotted  line  is  the  generalized1  curve  of  attendance.  The  general 
percentage  of  attendance  indicated  by  the  dotted  line  is  85%. 

The  reader  should  note  that  in  the  time  scale  a  week  during  which 
there  was  no  school  has  been  omitted. 


1See XI of Appendix A, page 474.
the amount and conditions of study, and with the relation
of the special work to the academic training. For detailed
information on these points the reader is referred
to  the  other  survey  reports. 

TESTING  PROGRAM1 

The  testing  period  extended  from  the  third  week  in 
March  to  the  end  of  the  second  week  in  June  (Figure  2). 
The  subjects  covered  were  reading,  writing,  arithmetic, 
English  composition,  and  spelling.  Counting  each 
separate  test  and  each  repetition  of  the  same  as  one, 
the  total  number  of  tests  given  was  55,  and  the  total 
number  of  papers  scored  and  tabulated  69,282.  With 
one or two minor exceptions, only well established standard
tests were used (Figure 3) and these only in the
fundamental subjects taught in the elementary grades.

TESTING  CONDITIONS 

The  effort  was  made  to  complete  in  one  day  the  giving 
of each test to the entire city. Owing to the departmentalization
of school work, however, and the lack of facilities
for testing large groups at one time, it was necessary
to  reach  classes  in  particular  rooms  where  conditions 
were  suitable,  so  that  in  some  cases  a  few  of  the  classes 
had  to  be  tested  on  the  day  following  the  general  test. 
The  children  were  not  all  tested  at  the  same  hour  of  the 


1See Table II, page 393, of Appendix A for a complete statement
of the days upon which tests were given.

day, but experimental investigations1 seem to show that
the  time  of  day  is  not  a  factor  in  determining  results. 
The  fatigue  arising  from  routine  work  disappears  before 

Figure  3 

Tests  Given  at  Gary 

Reading  Writing  Arithmetic 

Oral  Cleveland  Series  B 

Gray  Free-choice  Four  Operations 

Silent  Dictation 
Kansas  Cleveland 

Courtis  Composition  Multiplication 

Trabue  Fractions 

English  Composition  Spelling 

Original  Story  Cleveland  List  Tests 

Dictation  Tests 
Reproduction  of  Story  Composition  Test 

Total 55 Tests: 69,282 Papers.

the  stimulus  of  a  change  of  work  and  the  new  situation. 
At  any  rate  the  Gary  results  should  not  be  affected  by 
this  factor  more  than  the  results  from  tests  in  other 
school  systems.  In  general,  the  tests  were  given  on 
Tuesdays  and  Thursdays,  although  a  few  exceptions 
occurred.  Any  one  class  within  the  grades  tested  was 
visited  from  19  to  23  times.2 

The  conditions  as  to  light,  heat,  materials,  etc.,  under 
which  the  tests  are  given  constitute  an  important  group 
of  factors.  These  are  only  partially  under  control  and 
differ  greatly  from  day  to  day.  However,  they  differ 
no  more  for  the  testing  work  than  for  the  regular  school 
work.  The  tests  were  given  in  regular  classrooms,  the 
children  used  their  usual  pens  and  ink,  or  pencils,  and 

1See: Heck, W. H., Journal of Educational Psychology, Vol. V, page 92.
2See discussion of Table II, Appendix A, page 394.


Particular care was t
during  periods  when  they  wert 
demic  work,  that  there  might  1 
possible  in  changing  from  one  1 
That  is,  the  children  were  not 
ground  or  from  their  shop  worl 
suits  secured  cannot  justly  be 
conditions  of  this  character. 

The  manner  in  which  tests  are 
mining  the  degree  of  response  ma 
an  exceedingly  difficult  factor  to  c 
for  the  first  time  by  a  stranger  ; 
nervous  panic  which  absolutely 
sponse  to  the  test  situation,  al 
number  of  such  cases  does  not  exc 
group  at  most.  Again,  the  exigent 
require  adjustment  of  instructions ; 
varying  conditions  which  arise  in  cl 
and  at  different  hours  of  the  day, 
were  given  by  the  author  and 
all of whom are men professionally interested in measurement
and experienced in giving tests to school children.
For some of the tests, however, it was necessary
to use many more than four examiners. Through
the  kindness  of  Professor  Judd,  graduate  students  were 
secured  as  needed  from  the  classes  in  the  University  of 
Chicago.  These  were  given  the  necessary  specific 
training  the  day  before  they  were  used  as  examiners. 
Every  effort  was  made  through  frequent  conferences  and 
direct  training  to  keep  conditions  uniform. 

Particular  care  was  given  to  the  timing.  Whenever 
possible  the  examiners  used  automatic  timers,  consisting 
of  a  clock  with  electrical  connections  so  arranged  that 
it  could  be  set  to  give  automatically  the  starting  and 
stopping signals. For long or very short intervals, stopwatches
and foot-ball timers were used. In many cases
the  teacher  was  given  a  timer  also  and  asked  to  check 
the  examiner's  timing.  The  variations  noted  were  small 
and  often  due  to  the  difference  in  the  reaction  times  of 
teachers  and  examiners.  For  the  most  part,  the  timing 
was satisfactorily done and the errors kept within 2 per
cent. of the total time interval. Only one gross error in
timing was discovered and the results for that test and
class were rejected. Variation in timing as an explanation
of scores was thus reduced to a negligible factor.
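
The 2 per cent. limit on timing error described above amounts to a simple relative-error check. The following is a minimal sketch of such a check, not the survey's actual procedure; the function names and the sample intervals (a three-minute test recorded as 183 and 185 seconds) are hypothetical.

```python
def timing_error_pct(nominal_seconds, observed_seconds):
    """Absolute timing error as a percentage of the nominal interval."""
    if nominal_seconds <= 0:
        raise ValueError("nominal interval must be positive")
    return abs(observed_seconds - nominal_seconds) / nominal_seconds * 100.0


def within_tolerance(nominal_seconds, observed_seconds, tolerance_pct=2.0):
    """True when the observed interval is within the stated tolerance."""
    return timing_error_pct(nominal_seconds, observed_seconds) <= tolerance_pct


# Hypothetical readings from a teacher's check timer for a 180-second test.
print(within_tolerance(180, 183))  # 3 seconds off
print(within_tolerance(180, 185))  # 5 seconds off
```

On a three-minute test, then, a discrepancy of three seconds falls inside the limit and one of five seconds falls outside it.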

Previous  to  the  survey,  very  little  measurement  work 
had  been  done  in  the  Gary  schools.  To  accustom  the 
children  to  the  taking  of  tests  and  to  the  examiners,  as 
well  as  to  give  the  members  of  the  staff  an  opportunity 


wn.il  other's  papers.     The  second 
same  test  were  given,  one  after  ti 
day's  work  began  with  a  test  in  cai 
the  fifth  trial  of  the  test  in  copyii 
then  a  second  trial  of  the  test  in 
By  this  time  the  children  understa 
work,  were  fully  adjusted  to  such 
together,   turning  over  papers  rap 
promptly  on  signal.     It  may  be  sai< 
suits  are  too  high  because  of  this  i 
but  no  part  of  low  scores  can  justl 
undue  nervousness  at  being  timed,  c 
the  tests  themselves. 

One  factor,  the  effect  of  which  it  ii 
ate,  is  the  disturbance  caused  by 
Teachers  were  subject  to  inspectio 
naires,  etc.,  for  a  period  of  several  m< 
periences  are  not  conducive  to  who! 
effort.  As  far  as  the  testing  work  i 
the  disturbance  and  loss  of  time  were « 
more than offset by the stimulating effects of repeated
measurement  upon  both  teachers  and  children. 

The  children  seemed  to  enjoy  the  tests  and  frequently 
expressed  the  confident  opinion  that  they  had  done  well. 
The  teachers  and  principals  were  interested  also,  and 
from  the  superintendent  down  to  the  children  themselves 
there  was  full  cooperation.  There  is  every  reason  to 
believe,  therefore,  that  the  results  secured  represent  fairly 
the work of the children, and, subject to the general qualifications
to be made in Chapter VIII, constitute a fair
measure  of  the  children's  abilities  at  the  time  the  survey 
was  made. 

SCORING 

Whenever  possible,  the  tests  were  scored  by  the 
children  in  duplicate  (that  is,  by  two  individuals)  and 
later  the  scoring  was  checked  by  the  examiners.  It 
should  be  particularly  noted,  however,  that  in  every 
case  the  original  work  was  unmarked  by  the  children, 
the  scoring  being  done  upon  specially  prepared  answer 
cards. Most of the original material is still on file, unmarked,
just as it came from the children. This made
repeated  scoring  possible  so  that  errors  caused  by  faulty 
work have been almost completely eliminated.1

same period, 165 hours (55 days, 3 hours a day) were allotted to regular
work in the subjects tested. That is, the testing work at most decreased
the regular classroom instruction by 6 per cent. during the actual time
the tests were being given (11 weeks), and by less than 2 per cent. if
the entire year is taken as a base.

1In spite of every precaution, a few minor errors are discovered at each
rereading of data or proof.

Wherever the scoring involved more judgment than merely
checking  answers  right  or  wrong,  it  was  done  entirely  by 
members  of  the  staff,  and  then  often  only  after  suitable 
training  on  standardized  material.  Special  care  was 
taken  in  scoring  all  eighth  grade  papers.  The  data  given 
in  the  tables  which  follow  may,  therefore,  be  depended 
upon  to  represent  correctly  the  actual  results  secured. 

TABULATIONS 

Tabulation  was  carried  on  by  paid,  specially  trained 
assistants.  For  the  most  part,  these  were  students  of 
the  Detroit  Normal  School,  members  of  the  author's 
own  classes  in  educational  measurement.  Every  care 
was  exercised  to  check  each  result.  In  the  general  tables, 
however,  fractions  and  small  irregularities  have  been 
ignored.  The  results  are  correct  only  to  the  nearest 
tenth  of  an  example,  or  to  the  nearest  whole  per  cent. 
For  the  general  discussions,  curves  have  been  smoothed, 
approximations  used,  and  conclusions  drawn  from  general 
tendencies rather than from minor irregularities. However,
in the technical discussions, precise and detailed
information  is  also  given. 

CONCLUSION 

From  the  foregoing  paragraphs  it  should  be  evident 
that  the  work  of  the  most  representative  Gary  schools 
was measured with due regard to proper control of essential
conditions, and that equal care has been taken
in scoring and tabulating the results.1 In the succeeding
chapters,  in  which  the  data  are  presented,  little  further 
reference  to  these  phases  of  the  testing  will  be  made. 

1For a full discussion of this topic see Part II of the Seventeenth Yearbook
of the National Society for the Study of Education (1918), particularly
Chap, n.


III.    HANDWRITING

§1. General Results

Handwriting has long held a prominent place
in American schools. At Gary the annual time
allotment is 329 hours, or 7 per cent. of the total
time given to the fundamental subjects. For fifty American
cities the corresponding average allotment is 388
hours, which is also 7 per cent. of the total.1 Gary,
therefore,  is  typically  American  in  the  emphasis  put 
upon  this  school  art. 

SECURING  AND  SCORING  SAMPLES 

Samples  of  children's  handwriting  in  grades  2  to  12 
were  secured  in  three  different  ways.  The  tests  used 
were the Cleveland Free Choice Test, the Courtis Dictation
Tests, and the Composition Test.2 These were all
given  by  the  special  examiners.  The  teachers  took  no 
part  in  the  testing,  although  they  were  present  in  the 
rooms  at  the  time  the  tests  were  given. 

The  various  samples  were  measured  as  to  the  two  most 
fundamental characteristics of handwriting: rate, or number
of letters written per minute; and quality (general
merit) as determined by comparison with the Ayres Handwriting
Scale. However, to free the results from any possible

1See The Gary Schools: A General Account.
2For the meaning of these terms, see pages 48 to 53 of this book.

question as to the reliability of such scoring they will
be  presented  first  by  means  of  representative  samples. 
Sample  "A"  (Figure  4,  facing  page  30)  represents  the 
characteristic end product of the training in the elementary
grades. It is the writing of an eighth grade child in
the  free  choice  test,  was  written  at  the  rate  of  122  letters 
per  minute,  and  is  judged  to  be  equal  to  quality  45  on  the 
Ayres  Scale.  It  represents  approximately  the  median 
score1  made  by  the  entire  eighth  grade  group  (generalized 
eighth  grade  city  wide  score:  free  choice  test,  44  Ayres; 
composition,  42  Ayres;  dictation,  39  Ayres)2;  that  is, 
about  half  the  eighth  grade  children  at  Gary  wrote  as 
well  as,  or  better  than,  this  sample,  half  wrote  as  poorly 
as,  or  worse  than,  this  sample.  The  reader  thus  has  the 
opportunity  of  judging  for  himself  whether  or  not  this 
performance under the test conditions represents a satisfactory
result of eight years' training in writing.

1The median is the mid-score, a score such that there are as many
scores the same or larger as there are the same or smaller. More precisely,
it is "that point on the scale of the frequency distribution on each
side  of  which  one-half  of  the  measures  fall."     (Rugg.) 

An approximate method has been used in computing the medians in
this report. This method yields correct results when the total number
of scores is even; but when the total number of scores is odd, the median
is in error (too large) by 1/(2n)th of a step, when "n" represents the frequency
in which the median falls.

Throughout the report the non-technical reader may read "average"
for  "median"  without  serious  error  and  with  no  change  in  the  general 
thought  expressed. 
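
The definition quoted from Rugg in the note above (the point on the scale on each side of which one-half of the measures fall) can be illustrated for a frequency distribution. The sketch below is only an illustration of that definition, not the approximate method used in the report: the function name and the sample distribution are hypothetical, and it assumes that each recorded score marks the lower limit of its step and that scores are spread evenly within a step.

```python
def interpolated_median(freq_by_score, step=1.0):
    """Median of a frequency distribution, interpolated within a step.

    freq_by_score maps the lower limit of each step to the number of
    scores falling in that step (an assumed convention).
    """
    total = sum(freq_by_score.values())
    if total == 0:
        raise ValueError("no scores given")
    half = total / 2
    below = 0  # scores counted in the steps already passed
    for score in sorted(freq_by_score):
        n = freq_by_score[score]
        if n > 0 and below + n >= half:
            # Move into this step by the fraction still needed to reach half.
            return score + (half - below) / n * step
        below += n


# Hypothetical Ayres quality scores for eight papers, in steps of 10.
sample = {30: 2, 40: 4, 50: 2}
print(interpolated_median(sample, step=10))
```

With this made-up distribution, two papers fall below the 40 step and the median lies halfway through it, at quality 45, so that four papers lie on each side.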

2For actual median quality, see Table XV, page 76. The best score
made  by  any  eighth  grade  class  in  any  writing  test  was  48  Ayres, 
the  lowest,  35  Ayres. 
Two other examples, also from the free choice test,
will serve to define further the quality of writing (Figure
5, page 31). Sample "B" represents writing of the
quality (55 Ayres) that is equaled or exceeded by but
12 per cent. of the eighth grade children, while sample
"C" represents the quality of writing (30 Ayres) which
is equaled or exceeded by 94 per cent. of the eighth grade
children. In other words, most (82 per cent.) of the
eighth  grade  writing  falls  between  qualities  "B"  and 
"C."  If  the  values  were  to  be  based  on  any  one  test, 
the  quality  of  the  samples  would  need  to  be  changed 
somewhat,  but  in  no  case  would  the  change  amount  to 
more  than  half  a  step  on  the  Ayres  Scale. 
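
The 82 per cent. figure for the eighth grade follows from the two "equaled or exceeded" percentages by subtraction. A minimal sketch of that step is given here; the function name is hypothetical, and, like the text, it ignores the small number of papers that exactly equal the upper sample.

```python
def share_between(pct_at_or_above_lower, pct_at_or_above_upper):
    """Percentage of papers lying between a lower and an upper quality sample.

    Both arguments are percentages of children whose writing equals or
    exceeds the given sample.
    """
    if pct_at_or_above_upper > pct_at_or_above_lower:
        raise ValueError("the upper sample cannot be exceeded by more papers")
    return pct_at_or_above_lower - pct_at_or_above_upper


# Eighth grade, free choice test: sample "C" (30 Ayres) is equaled or
# exceeded by 94 per cent., sample "B" (55 Ayres) by 12 per cent.
print(share_between(94, 12))
```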

The same samples may be used to extend the illustration
to other grades. For instance, in the composition
test1 the writing of but one twelfth grade student in four
equals or exceeds the quality of sample "B," while the
writing of but one twelfth grade student in twenty is as
poor as, or worse than, sample "C." That is, approximately
75 per cent. of the twelfth grade writing falls
between samples "B" and "C." In similar fashion,
approximately 45 per cent. of the fourth grade
writing falls between samples "A" and "C." Half
of the fourth grade writing is worse in quality than sample
"C."

The generalized city-wide median scores² for both
rate and quality of writing in all three tests (Table II,
page 33, and Figure 6) show that the quality of
writing at Gary varies from 32 Ayres in the fourth grade
to a maximum quality of 51 Ayres in the twelfth grade.
The progress in rate of writing is marked in the free
choice test, but the improvement in quality from
grade to grade is small in all tests.

¹See page 76.

²For explanation of the sense in which these terms are used see XI
of Appendix A, page 472.

Figure 6
Rate and Quality Scores in the Three Handwriting Tests¹

[Figure: quality on the Ayres Scale (vertical axis, 10 to 60) plotted
against rate in letters per minute (horizontal axis, 0 to 130), with
separate curves for the composition, free choice, and dictation tests.]

The scale along the base of the figure represents rate of writing, or
letters written per minute. The scale along the vertical axis represents
quality on the Ayres Scale. The heavy solid line represents median
scores in the composition test; the heavy broken line, the scores in the
free choice test; the dotted line represents scores in the dictation test.
The positions of the various grade medians are indicated by figures along
the curves.

The graph shows that the free choice and dictation tests agree closely
in both rate and quality; that the composition test was written at a much
lower rate and with somewhat higher quality than the other tests.

¹See Table II, page 33.


The  sample 
of  handwriting 
the  free  choice  test, 
be  quality  45  Ayres, 
eat  judges  were  40, 
given  the  free 
both  of  which  recex 


ual  received  a  score  of  55  Ayres,  in  the  corn- 
group  of  130  eighth  grade  papers  in  this  test, 
ie  same  mark  (45  Ayres)  and  29  which  were 
papers  were  marked  35  Ayres  or  lower,  ax 
',  basing  conclusions  on  all  the  writing  tests  in 
le  above  represents  very  closely  the  median 
instruction  at  Gary. 


[Table II, page 33, and the tabular matter on the following pages are
illegible in this copy.]

The results of the three tests are consistent and show
only such differences as are to be expected. In the
elementary grades the qualities agree very closely. The
quality in the eighth grade dictation test is lower than
in the other tests, but this is undoubtedly due to
the fact that attention was being given to spelling
what were, for the Gary children, difficult words
(average accuracy of spelling 55 per cent.). The
free choice results in most grades are a little better
in quality, probably because under the conditions
of the test the children had had a day's preparation
in writing the test sentences. In grades ten,
eleven, and twelve the quality of writing in the composition
test is slightly higher than in the other tests,
but the rate is very much lower; also the students in
the last year of the high school are a very select group,
owing to the eliminations in the previous grades. However,
the maximum differences are not large and the
best quality reached in any test by any grade is not high.

DESIRABLE  GOALS  IN  HANDWRITING 

The conclusion that the writing of the Gary children
is poor may seem unwarranted to certain persons. They
may take the position that the Gary children write well
enough; that the quality of writing shown in sample
"A" is satisfactory; that the conventional school overemphasizes
the subject. Table III, taken from the Fourteenth
Yearbook of the National Society for the Study
of Education, is a reply to this criticism. It represents
the judgments of a number of business houses in regard
to the quality of writing that is essential. The standard
quality for the eighth grade product as set by Freeman
as a result of the investigation is quality 70 on the Ayres
Scale (Figure 7).

COMPARATIVE DATA*

In Gary, results as a whole are rather higher in rate
and much lower in quality than the results reported
by other surveys (Table IV, Figures 8 and 9). For
instance, the Gary eighth grade score for quality, 43
Ayres, is 12 points lower than Cleveland (55), 14 points
lower than Starch's Standard (57), 20 points lower than
Freeman's results (63), and 29 points lower than St.
Louis (72). The differences in the third grade are often
small (Gary 30.8, St. Louis 31.3), but the rate of progress
at Gary is apparently much less than in other cities.
This is offset in part by the higher rate of writing at
Gary. The data tend to show, therefore, that the Gary
children wrote freely, paying little attention to the
quality of their work. However, the rate of writing at
Gary corresponds closely to the results that have been
obtained in other cities where children were writing
freely and did not know that quality of handwriting was
to be considered.¹

*In survey work the temptation is great to compare results from city
to city as if they were secured under identical conditions, but the methods
of measurement are so new, and the factors to be controlled so
many, that in spite of the recognized ability of the men engaged in
survey work, variations in conditions are bound to occur. As shown in
a later chapter, the effect of lack of knowledge of conditions is so serious
that those making comparisons should be conscious of the danger of
erroneous inferences. Nevertheless, whenever possible, in this report the
Gary results are submitted with tables of similar results from other
cities, and every effort has been made to array the data in such form
that persons wishing to make comparisons may do so with the least
danger of misrepresentation.

Those who make comparisons should always remember that while
tests reveal differences in achievement, they do not in any way reveal
the causes of the differences.

Figure 8
Comparative Gary Scores in the Free Choice Test

The scale along the base of the figure shows rate or number of letters
written per minute. The scale along the vertical axis represents quality
on the Ayres Scale. The heavy solid line represents Gary scores in the
free choice test. The light line represents average scores of fifty-six
American cities. For both curves the positions of the grade medians
are indicated by figures.

The graph, as a whole, shows that the Gary scores are much higher in
rate and much lower in quality than the average of fifty-six American
cities (Freeman).

Figure 9

The scale along the base of the figure represents rate, or number of
letters written per minute. The scale along the vertical axis represents
quality on the Ayres Scale. The heavy solid line shows Gary results in
the composition test; the heavy broken line indicates the free choice test.
The light solid line represents the Grand Rapids results in the composition
test, the light broken line in the free choice test. On all the curves
positions of the various grade medians are indicated by figures.

Curves show that the fifth grade in the free choice test at Grand
Rapids wrote at about the rate and with about the quality of the twelfth
grade in the composition test at Gary. For both Gary and Grand
Rapids there are only slight differences in quality in the free choice
and composition tests. The Grand Rapids rate in the composition
test is probably not comparable with the Gary rate, as at Grand Rapids
the writing of the compositions was timed only in the most general way,
while at Gary the time each composition was finished was noted. However,
it is probable that the only effect of this difference would be to shift
the position of the curve with reference to the rate axis, not to change
its character. For a statement in regard to the conditions under which
the Grand Rapids tests were made see Table IV.

For the free choice test the papers were sorted by
grades for quality and the average rate found for each
quality, as was done in Cleveland. For instance, in the
fifth grade at Gary, 128 papers of quality 20 averaged
in rate 58 letters per minute, while 14 papers of quality
50 averaged 62 letters per minute, an increase of 4 letters
per minute. But at Cleveland the papers of quality
20 averaged 73 letters per minute, while the papers of
quality 50 averaged 57 letters per minute, a decrease
of 16 letters per minute (Table V, page 43, Figure 10,
page 44). In other words, in Cleveland the poor writers
wrote rapidly and the good writers slowly, but at Gary,
except for the very, very poor writers who wrote very
slowly, the good and poor writers wrote at about the
same rate. The difference is probably due to the difference
in the effect of the training in the two cities.²
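
The sorting described above (papers grouped by quality, then an average rate for each quality) can be sketched as follows. The function name and the four sample papers are hypothetical. The interval convention follows the note to Table V: each quality covers scores from its own value up to, but not including, the next higher value.

```python
from collections import defaultdict


def average_rate_by_quality(papers, step=10):
    """Group (quality, rate) pairs into quality intervals and average the rates.

    An interval labelled 20 covers qualities from 20.0 up to, but not
    including, 30.0.
    """
    rates = defaultdict(list)
    for quality, rate in papers:
        interval = int(quality // step) * step
        rates[interval].append(rate)
    return {q: sum(r) / len(r) for q, r in sorted(rates.items())}


# Hypothetical papers: (quality on the Ayres Scale, letters per minute).
papers = [(20, 60), (25, 56), (50, 61), (55, 63)]
print(average_rate_by_quality(papers))  # {20: 58.0, 50: 62.0}
```

Comparing the averages across intervals shows whether rate rises, falls, or stays level as quality improves, which is the contrast drawn between Gary and Cleveland.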

SCHOOL TO SCHOOL COMPARISONS

In the dictation test, Froebel has 7 classes markedly
above the city average and 4 markedly below. For
the composition and free choice tests the figures are 1
above, 6 below, and 2 above, 7 below respectively. For
the Emerson school the figures are 2 above, 5 below;
2 above, 1 below; 2 above, 2 below (Table VI, page 45).
The other schools show similar variations. The conclusion
to be drawn is that there is almost no trace of constant
differences from school to school. The differences in the
organization and administration of the four schools and
in the social conditions of their pupils are not reflected
in any positive fashion in the results of the writing tests.

¹Fourteenth Yearbook, National Society for the Study of Education,
Part I, pages 56 and 7a

²See also pages 252-253, Survey of St. Louis Public Schools. The St.
Louis curves resemble Gary's, not Cleveland's, but at a lower rate level.

TABLE V

Rate-Quality Development in Gary and in Cleveland
AVERAGE RATE FOR VARIOUS QUALITIES

                    FIFTH GRADE                  EIGHTH GRADE
QUALITIES   NO. OF                       NO. OF
            CASES    GARY   CLEVELAND    CASES    GARY   CLEVELAND

10†            9      62
20           128      58       73          10      78       97
30           128      64       66          89      94       88
40            61      66       68          68      93       86
50            14      62*      57          23      92       81
60             6      68       67           7      93       78
70             1      46       67                           78
80                             64                           76
90                             61                           71

†In this report, the range of scores in each interval of a distribution is from the
value given in the table up to but not including the next higher value. For instance,
10 in this table indicates a range of scores from quality 10.0 to and including quality 19.99+,
but not quality 20.0.

*Average 6s if one extremely low score is ignored.

The table is to be read as follows: Of nine individual records
at Gary, whose quality of writing ranged from 10 to 19 on Ayres' Scale,
the average rate of writing was 62 letters per minute. Of 128 cases
in Gary of quality 20 to 29 the average rate of writing was 58 letters
per minute, while at Cleveland the rate for samples of the same quality
was 73 letters per minute. The number of cases in Cleveland is not
known. Other results in the table are to be read in similar fashion.

Figure 10
Rates of Writing in Gary and in Cleveland¹

The horizontal axis represents quality. The vertical axis represents
rate. Positions of circles represent both rate and quality. The number
of letters written per minute for each quality is given in the circles. Solid
line represents eighth grade, dotted line represents fifth grade. Lines
marked "G" represent Gary scores, and those marked "C" represent
Cleveland scores.

Inferences: In Cleveland there is an inverse relation between rate
and quality. The writing of the children who had the highest rate was
the poorest, while those who wrote most slowly had the highest quality.
At Gary, except for very low qualities, the rate of writing was the same
for all qualities. The children who wrote most slowly at Gary had also
the poorest writing.

¹See Table V, page 43.

CONCLUSION

The  results  reported  above  indicate  consistently  that 
handwriting  instruction  at  Gary  is  producing  very  small 
effect  upon  the  product.  The  improvement  in  quality 
is  small  and  does  not  keep  pace  with  the  change  in  the 
rate  of  writing. 

§2. Critical Discussion

CHARACTERISTICS OF HANDWRITING

Development  of  skill  in  handwriting  is  essentially 
the  development  of  a  motor  habit.  However,  good 
writing  is  dependent  not  alone  on  the  perfection  of 
motor  habits,  but  on  the  harmony  between  the  visual, 
movement,  pressure,  and  thought  "controls"  which 
keep  the  writing  process  going  and  direct  it. 

In spite of the complexity of writing ability, performance
in a writing test furnishes a simple record which
is definitely objective and easily measurable. This
measurement may follow either of two lines, measurement
of gross characteristics only, or measurement of
analytical details. The gross characteristics are rate
(number of letters written per unit of time) and quality
(or goodness, general merit, etc.). For survey purposes
measurement of gross characteristics alone is of importance,
and only such was made in this case.

As the function of writing is to record thought in
such a form that it can be easily understood by others,
legibility is sometimes given as a quality to be measured.
Legibility, however, is itself complex, being dependent
upon the relative excellence of form, alignment, and other
characteristics. In comparing two samples it is quite
impossible to decide on the basis of subjective judgment
alone whether or not one sample is more legible than the
other. A real measure of legibility requires accurate
measurement of the time taken to read a sample under
carefully controlled conditions. On the other hand, any
and every sample produces on a reader an impression of
goodness or badness into which the many particular impressions
blend. Accordingly, the expression "quality"
or "general merit" will be used in place of legibility.
That is, it is believed that whether measurement is
made by the Thorndike or by the Ayres Scale, comparison
proceeds on the basis of the impression produced by the
samples as wholes, and not upon the basis of legibility
alone.

TESTING  CONDITIONS 

Two  direct  problems  are  involved  in  measurement  of 
handwriting:  (1)  control  of  the  conditions  under  which 
the  samples  are  secured,  and  (2)  measurement  of  the 
rate  and  quality  of  the  resultant  writing. 

Writing, as a motor habit, is under voluntary control.
That is, an individual may, within limits, vary the rate
and quality of his writing at will. In general, the more
a person has to hurry, the less care he will be able to give
to the formation of his letters, and vice versa. Hence,
the performance of an individual will vary as the conditions
under which he writes vary. It becomes important,
therefore, to choose such conditions that the
samples secured may be of such a character that inferences
as to the abilities of the children will be reliable.
The physical factors influencing writing may be dismissed
at once. At Gary all tests were conducted in
regular classrooms, and the writing was with pen and ink
(pencil was used in the lowest grades and in a few other
cases where ink was not available) on paper of good
quality, so that temperature, humidity, ventilation,
materials, etc., were those which usually prevail in school
work. The main factors to be controlled were, therefore,
two: incentive and subject matter.

METHODS  OF  SECURING  SAMPLES 

Teachers of writing often base their judgments as to
children's ability upon samples secured by asking
the children for their best writing. This emphasis
on quality, as everyone well knows from his own experience,
leads to a highly specialized performance
quite unlike the usual writing of the individual. The
purpose of this survey, however, was conceived as an
effort to determine real ability,¹ not maximum performance.
Therefore, quality was not emphasized in securing
samples.

¹The reader should remember that the real ability of an individual
is his median performance or effective ability. That test is to be judged
the most perfect test of handwriting which reveals not the best writing
of which the individual is capable, nor the worst which he will do, but
the quality nearest like that shown by his penmanship under everyday
conditions in which the writing activity is functioning normally. The
author considers it of utmost importance for a correct understanding of
the results secured from the tests of handwriting in this survey that the
distinction between performance and ability be clear. The point of
this footnote is that the ability of an individual in handwriting is not
to be inferred from a carefully prepared letter of application for a position
in which the quality of writing is known to be a factor determining
employment, nor from hastily scribbled notes written while riding on a
train, but from the kind of writing most often appearing in the daily
work. Test conditions should be such that the samples secured show
writing of this type. See also Chap. VIII.

A method in more general use is that of giving the
children material to copy or write, and wording the instructions
in such a way that the children understand
they are free to determine for themselves the rate and
quality of their writing. A test of this character is
known as a free choice test. The instructions recommended
by Starch are "write as well as you can and as
rapidly as you can." In the Cleveland Survey, the
teachers were told that papers would be marked for both
speed and quality. The free choice test has been widely
used. The instructions at Gary were made to conform to
the Cleveland model.¹

¹See Judd, C. H., Measuring the Work of Public Schools, Cleveland
Survey.

The objection to the free choice test is that the element
of choice prevents a real measure of the efficiency of the
teaching. For if, as Freeman has shown,² certain levels
of quality and certain rates of work are required for
business life, the efficiency of school training should be
judged by the percentage of the total product which
measures up to the standard. In a free choice test a
child capable of writing at the required rate and quality
may choose to write at a much higher rate and with
consequent sacrifice of quality, or may emphasize quality
at the expense of rate. In other words, for measurement
of efficiency the children should write at the standard
rate (since rate of writing may be controlled through the
rate at which material is dictated), and the resultant
writing measured for quality. If, however, the material
used is of unusual spelling difficulty, or is not easily
comprehended, such difficulties may invalidate the test
as a measure of writing ability. The material dictated
at Gary served also as a test of ability to spell certain
words. The value of the test as a writing test will be
discussed later.

²See Fourteenth Yearbook of the National Society for the Study of
Education, page 72.

A fourth method of securing samples which represent
children's ordinary¹ writing, and probably the best, is
to use material written for another purpose. At Gary,
the papers written in the composition test, where quality
of writing was not emphasized, were used also as samples
of the children's writing. As a check upon these results,
in certain classes reproductions of the simple story used
as a test of comprehension in reading were also scored for
handwriting.

¹It should be recognized, of course, that "ordinary" is here used to
mean the kind of writing which the children have been in the custom
of using for their written work in the composition class. It may or may
not resemble the "ordinary" writing of the child out of school. In a
class where the teacher of English composition has emphasized quality of
writing the children might pay more attention to quality than in a
class where the English teacher did not consider writing at all.

The second factor which affects the quality of the writing
is the material written. Obviously, if much attention
must be given to the understanding or spelling of unfamiliar
words, little can be given to the writing. Most
free choice tests use as material a familiar stanza, as
"Mary had a little lamb," which is written again and
again. In Cleveland the material for all grades was
the first three sentences of Lincoln's Gettysburg speech.
The instructions to the teachers provided, however, that
as a preliminary preparation the pupils should "read and
copy this (material) until they were thoroughly familiar
with it and practically knew it by heart." The same
material¹ was used in the Gary test and the same
plan of preliminary preparation by the teacher² was
followed. In Cleveland, the tests were given by the
teacher, and in Gary, by specially trained examiners.
No attempt was made in either Gary or Cleveland to find
out how completely the teachers had availed themselves
of the opportunity to practice on the test material.

¹See page 487.

²One day's preparation was provided for at Gary; at Cleveland the
amount is not specified, but may have been as much as a week. See:
Measuring the Work of the Public Schools, page 235.

In the dictation exercises the words used to test spelling
were taken from Ayres' thousand commonest words
in written English. As far as possible, no test word was
used which was not of less spelling difficulty than the
other words in the sentences. It was expected that
the dictation sentences would be easy material, well
within the powers of the children, but owing to the
limited abilities of the Gary children in spelling, it is
probable that the material was too difficult to afford a
true measure of the children's writing ability. Therefore,
the results from the dictation tests should have the least
weight in making decisions as to the character of the
product of writing instruction at Gary.

The rate of dictation was based upon a number of
determinations by Freeman,¹ Courtis¹ and others, of the
rate at which children write when writing freely (as in
reproducing a story). In other words, the material
was dictated at the rate at which children ordinarily
write, in order to secure samples whose quality might
correspond to the quality of their ordinary writing. This
method prevents overemphasis on quality. It forces
some children to write at what is for them an abnormally
high rate. The purpose of the test is not to secure the
best writing of which the children are capable, but to
determine how many of the children have been developed
to the required quality level at the given rate level.² As
the results of the free choice test show, the rates at which
the material was dictated were almost exactly the rates
of writing chosen by the children in the free choice test.*

¹Fourteenth Yearbook, Part I, National Society for the Study of
Education, pages 56, 70, 76.

²The method of this test did not function at Gary because there was
no tendency on the part of the children to overemphasize quality.

From the foregoing discussion it will be seen that conclusions
as to writing abilities of the Gary children are
based upon a series of measurements which give opportunity
for significant variations in performance.

METHOD  OF  SCORING 

The scores for rate of writing in the free choice test
were determined as follows: Immediately upon the
completion of a test the examiner had the children exchange
papers. He then passed each child a score card.²
In all grades the children filled out the blanks on the
card and then in grades five to twelve, by the aid of the
count of the letters of the test passages printed on the
back of the card, they determined the number of letters
that had been written in two minutes. Later this
count was verified by the examiners, mistakes noted
and counted, and the rate computed.

*In grades eleven and twelve the controlled rate was approximately ten
letters per minute lower than in the free choice test.

²See Appendix B, page 488.

It was assumed that the letters written were of equal
difficulty, although this was known not to be the case.
The letter "i" is much easier to make than "g," for
example. However, in writing one hundred words the
relative frequency of the different letters is so constant
(Table VII) that the errors due to differences in the
difficulty of the various letters are negligible. The third
and fourth grade scores may be in error by a small
amount from this cause, but as the relative difficulty
of the letters is unknown, it is not possible to apply a
correction for this factor.

A  far  greater  source  of  error  in  determining  rate  of 
writing  is  found  in  the  difficulty  presented  to  the  children 
by  certain  words.  For  instance,  the  rate  of  writing  of 
the third grade in the free choice test was but five¹ letters
per  minute.  The  fourth  and  other  grade  rates  for  these 
same  tests  are  in  substantial  agreement  (fourth  grade 
42-44).  It  is  probable,  therefore,  that  some  factor  was 
operating  to  depress  the  third  grade  scores  in  the  free 
choice  test. 

Upon  examining  the  test  material  from  this  point  of 
view,  one  is  struck  by  the  difficulty  of  the  first  phrase: 
"Fourscore  and  seven  years  ago."  This  is  probably 
the  whole  cause  of  the  low  score.  The  material  was 
too  difficult  for  third  grade  children.  However,  no 
trace  of  similar  effects  is  observable  in  other  grades. 

In  the  dictation  tests,  no  measurement  of  rate  was 
necessary  as  the  tests  were  constructed  to  be  dictated 
at  a  given  rate.  The  formula  used  in  the  construction 
of  the  tests  was: 

T = nr + T_1

in which T = the total time allowed, n = the number
of letters to be written, and r = the rate in seconds per
letter. The correction T_1 is an allowance made for
the time needed for dictation, the rate of reading being

1See  Table  IV,  page  41. 
ten letters per second. The value of r for the various
grades was as follows:

Grades    2     3     4     5     6     7     8
r =       2.40  1.87  1.36  1.05  .88   .75   .64
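The timing formula can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the grade column is read as running from 2 through 8, and the allowance T_1 is assumed to equal the number of letters divided by the stated reading rate of ten letters per second.

```python
# Seconds allowed per letter (r) for grades 2-8, as listed in the text.
RATE_SECONDS_PER_LETTER = {
    2: 2.40, 3: 1.87, 4: 1.36, 5: 1.05, 6: 0.88, 7: 0.75, 8: 0.64,
}

# Assumption: the dictation allowance T_1 is n letters read at this rate.
READING_LETTERS_PER_SECOND = 10.0


def dictation_time_seconds(n_letters: int, grade: int) -> float:
    """Total time T = n*r + T_1 for a dictation test of n_letters letters."""
    r = RATE_SECONDS_PER_LETTER[grade]
    t1 = n_letters / READING_LETTERS_PER_SECOND
    return n_letters * r + t1


def implied_letters_per_minute(grade: int) -> float:
    """Writing rate implied by r, ignoring the dictation allowance."""
    return 60.0 / RATE_SECONDS_PER_LETTER[grade]
```

Under these assumptions a 100-letter passage in the eighth grade would be allowed 64 seconds of writing time plus 10 seconds for dictation.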

In  the  composition  and  reproduction  tests  the  children 
counted  the  number  of  words  written  and  recorded  their 
scores  on  their  papers.  These  scores  were  later  verified 
by  the  examiner  and  transferred  to  cards. 

For  the  composition  and  reproduction  tests  the  scores 
in  words  written  per  minute  were  converted  to  letters 
per  minute  by  determining  the  average  number  of  letters 
per  word  for  a  series  of  papers  in  each  grade.  After 
four  or  five  papers  the  results  are  constant.  In  the  sixth 
grade  ten  papers  chosen  at  random  were  used  (Table 
VIII,  page  57).  In  other  grades  tabulations  were  carried 
to  constant  results  only,  not  less  than  five  papers  being 
used  in  any  grade. 

The  average  values  finally  selected  for  converting  word 
per  minute  into  letters  per  minute  will  be  found  in 
Table  IX,  page  58. 
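The conversion described above amounts to a single multiplication per grade. The sketch below is illustrative only: the function names are hypothetical, the factors are the composition values as read from Table IX, and the letter count simply treats every alphabetic character as a letter.

```python
# Average letters per word in the composition test, by grade (Table IX).
COMPOSITION_LETTERS_PER_WORD = {
    4: 3.4, 5: 3.5, 6: 3.6, 7: 3.7, 8: 3.9,
    9: 3.9, 10: 3.9, 11: 4.0, 12: 4.0,
}


def letters_per_minute(words_per_minute: float, grade: int) -> float:
    """Convert a rate in words per minute to letters per minute."""
    return words_per_minute * COMPOSITION_LETTERS_PER_WORD[grade]


def average_letters_per_word(papers: list[str]) -> float:
    """Estimate the conversion factor from a sample of papers."""
    total_letters = 0
    total_words = 0
    for text in papers:
        words = text.split()
        total_words += len(words)
        total_letters += sum(ch.isalpha() for word in words for ch in word)
    return total_letters / total_words
```

A fourth grade child writing 10 words per minute would thus be credited with about 34 letters per minute.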

MEASUREMENT OF QUALITY

The Ayres Handwriting Scale ("Three Slant" edition)
was employed to determine quality. This scale consists
of a series of specimens of handwriting ranging in
quality from very bad to very good. The samples were
chosen on a basis of "legibility"; that is, careful records
were made of the time required to read a large number
of different samples and certain of these were chosen
as units in a scale. The differences in legibility from
sample to sample were made equal.1

In  using  the  scale,  a  sample  of  writing  is  compared 
with the samples of the scale until one is found which
corresponds  in  general  quality  with  the  sample  being 
measured.  The  number  of  the  scale  sample  is  then 
noted  and  the  specimen  being  measured  is  given  the  same 
value.  If  the  sample  being  measured  falls  between  two 
scale  samples,  it  is  given  a  value  to  correspond. 

RELIABILITY  OF  RESULTS 

In  order  to  make  the  reported  handwriting  results 
as  reliable  as  possible,  the  eighth  grade  papers  in  all 

TABLE IX

Average Values for Changing Rate of Writing from Words Per
Minute to Letters Per Minute

GRADE          4    5    6    7    8    9    10   11   12
COMPOSITION    3.4  3.5  3.6  3.7  3.9  3.9  3.9  4.0  4.0

REPRODUCTION
TEST I:    3.4, 3.4, 3.4
TEST II:   4.0, 4.0, 4.0, 4.0
TEST III:  3.4, 3.6, 3.5, 3.5, 3.8

This table is to be read as follows: The rate of writing in the
composition test in the fourth grade was changed to letters per minute by
multiplying the number of words written per minute by 3.4. The
resulting values are the true rates within approximately 1 per cent.

1See Bulletin No. 113, Division of Education, Russell Sage Foundation.
the writing tests were scored by from three to five judges
and the median scores taken as the real quality of the
samples. The scores are given in full for one class
(Table X, page 60 and Figure 11, page 62). Although
individual judges differ widely on some samples, yet the
scores as a whole show close agreement. Of the 170
individual ratings on the 34 papers of the class, 73, or
43 per cent., agree exactly with the median scores, 57
more, or 34 per cent., fall within five points, or half a
step of the scale, and only six, or 3.5 per cent., of the
judgments differ more than one step from the median
value. The average deviation of the individual judgments
from the medians is 4.3 points, slightly less than
half a step on the scale. But 47 of the 97 deviations
were positive and 50 negative, so that the median score
of the class as determined by each of the judges alone is
closely the same as the median class score determined
from the median score on each sample. For two of the
judges the differences are zero. Two judges differ by
one point only, and the other by six points. If the
actual median scores are used, all the differences, except
one, are zero. The effect of combining the scores of
the different judges is to eliminate the wide variations
which occur with each judge on certain samples.
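The agreement figures in this paragraph come from comparing each judge's rating with the median rating of the same paper. A minimal sketch of that tally follows, assuming ratings on the Ayres Scale in steps of five points; the function name and the sample ratings are invented for illustration and are not taken from Table X.

```python
from statistics import median


def summarize_ratings(ratings_by_paper: dict[str, list[float]]) -> dict[str, float]:
    """Tally deviations of individual ratings from each paper's median."""
    deviations = []
    for scores in ratings_by_paper.values():
        paper_median = median(scores)
        deviations.extend(score - paper_median for score in scores)

    total = len(deviations)
    return {
        "total": total,
        "exact": sum(1 for d in deviations if d == 0),
        # Half a step of the scale is five points.
        "within_half_step": sum(1 for d in deviations if 0 < abs(d) <= 5),
        "positive": sum(1 for d in deviations if d > 0),
        "negative": sum(1 for d in deviations if d < 0),
        "average_deviation": sum(abs(d) for d in deviations) / total,
    }


# Hypothetical ratings by five judges on two papers.
example = {
    "paper_1": [50, 50, 55, 50, 45],
    "paper_2": [60, 70, 60, 65, 40],
}
summary = summarize_ratings(example)
```

Applied to the full set of 170 ratings, the same tally would yield the counts of exact, near, and divergent judgments reported above.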

The constancy of the general results is quite remarkable.
The average deviation of the judges in 1901
judgments of the papers of five eighth grade classes is
I.9 points. That is, a single judgment will, on the average,
differ from the median of five independent judgments


[Table X: quality scores assigned by the individual judges to the 34 papers of one eighth grade class. The tabular data are not recoverable.]

FIGURE 11

Variations in Quality Scores Assigned by Means of Ayres Scale1

[Figure panels: Handwriting, Variations in Scoring; Quality (Ayres); Deviations from Median]
Figure 11 (Continued)

The diagrams at the left hand side of the figure represent individual papers.
The numbers 8, 17, 31, 34, etc., refer to the numbers of the papers in
Table X. Letters A, B, C, D, and E refer to the different judges.
CL  indicates  values  for  class  as  a  whole.  Each  light  line  in  the  diagrams 
represents  the  value  assigned  the  sample  by  one  judge.  Each  heavy  line 
represents  the  value  adopted  as  the  true  value  of  the  sample. 

The  scale  along  the  top  of  the  figure  represents  quality  on  the  Ayres 
Scale.  The  diagrams  show  that  for  the  class  as  a  whole  the  class  score 
as  determined  from  the  scores  of  each  single  judge,  except  D,  agrees 
closely  with  the  value  as  determined  from  the  combined  scores  of  the 
five  independent  judges. 

Diagram marked 8 represents the score of a sample upon which
there was close agreement between the five judges. Paper 17 represents
the scores for a sample in which Judge D showed a wide variation.
Papers 31 and 34 represent samples on which there was little
agreement between the different judges.

The  diagrams  on  the  right  hand  side  of  the  figure  show  the  distributions 
of  the  deviations  from  the  scores  adopted  as  the  true  value  of  the  samples. 
The  scale  at  the  top  shows  the  magnitude  and  quality  of  the  deviations. 
The  scale  at  the  right  of  each  distribution  shows  the  number  of  deviations 
of  a  given  type,  as  do  also  the  figures  written  in  the  diagrams  just  above 
the  base  line.    Diagram  for  Judge  A  is  to  be  read  as  follows: 

Out  of  the  34  papers,  Judge  A  gave  3  scores  which  were  10  points 
higher  than  the  proper  value  of  the  paper,  8  scores  which  were  5  points 
higher,  14  scores  which  agreed  exactly  with  the  true  value,  8  scores 
which  were  5  points  lower  than  the  true  value,  and  1  which  was  15  points 
lower. 

Note that Judges C and E have a tendency to score papers too low
while Judge A has a tendency to score papers too high. Note also that
in the distributions of the deviations for the class as a whole the
distribution is symmetrical about the zero point, and that 130 out of 170
deviations are not greater than 5 points.
less than half a unit of the scale. Of 1901 judgments,
931, or 49 per cent., agreed exactly with the median
value for the sample. Of the remaining judgments,
497 were positive and 473 were negative. All the
evidence shows, therefore, that the judges used the scale
in a consistent manner, and that a class score, even when
determined by a single judge, would not be in error
more than a small part of one division of the scale.

The quality of the writing of grades other than the
eighth was determined by the scoring of one judge. In
twenty-two classes, however, mainly of the fourth and
sixth grades, the papers were scored by two judges.
The median difference in the median scores for the
various classes as determined by the two judges
independently is again a very small part of one step of the
scale. These figures make it possible to say that the
scores of the various classes may be depended upon to
represent the quality of "writing" actually found in
the papers within a third of one division of the scale.

STANDARDS  OF  JUDGMENT 

To  give  the  reported  results  real  objective  validity, 
the  Gary  judges  scored  a  set  of  "standard  samples." 
Part  of  these  were  taken  from  the  Thorndike  Scale,  and 
the  rest  are  the  samples  published  by  Thorndike  for  the 
purposes  of  teacher  training,  known  as  Supplement  A.1 
The  values  assigned  these  samples  by  the  Gary  judges  are 

1Teachers College Record, November, 1914.
Table XII (Continued)

This table is to be read as follows: Sample 221 was one of those rated
60 by the Gary judges and was given a different rating by the Ayres
judges. Sample 23 was rated 60 by the Ayres judges and was given a
different rating by the Gary judges. Sample 106 was rated 60 by both
sets of judges. The Thorndike value of sample 221 was 10.2, of sample
23 was 11.0, of sample 106 was 11.0, etc. The nine samples rated 60
Ayres by the Gary judges ranged in value according to Thorndike from
10.2 to 13.2. The median score was 12.0. Therefore, according to the
Gary judges, 60 Ayres is, in general, equivalent to 12.0 Thorndike
(A. D. ± .7). Similarly, according to the Ayres judges, 60 Ayres is, in
general, equivalent to 12.1 Thorndike (A. D. ± .6).

In similar fashion, in the second part of the table, all samples
having the same Thorndike value (12.35) are grouped together and the
average Ayres value found. According to the Gary judges this value is
62.5 ± 4.4, according to the Ayres judges it is 64 ± 4.8. Each of the
values in Table XI was determined in similar fashion.
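The grouping procedure in this note can be illustrated with a short Python sketch. It assumes that "A. D." denotes the average absolute deviation from the median; the function name and the sample values are invented for illustration and do not reproduce Table XII.

```python
from statistics import median


def thorndike_equivalent(samples: list[tuple[float, float]],
                         ayres_rating: float) -> tuple[float, float]:
    """Median Thorndike value, and its average deviation, for all
    samples that received the given Ayres rating.

    samples holds (ayres_rating, thorndike_value) pairs.
    """
    values = [thorndike for ayres, thorndike in samples if ayres == ayres_rating]
    centre = median(values)
    # Assumption: A.D. is the mean absolute deviation about the median.
    average_deviation = sum(abs(v - centre) for v in values) / len(values)
    return centre, average_deviation


# Hypothetical (Ayres, Thorndike) pairs, for illustration only.
pairs = [(60, 11.0), (60, 12.0), (60, 13.0), (70, 13.5), (70, 14.5)]
centre_60, ad_60 = thorndike_equivalent(pairs, 60)
```

Swapping the two members of each pair gives the reverse grouping described in the second part of the note, in which samples of equal Thorndike value are averaged on the Ayres Scale.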


given  in  Table  XI,  pages  64  and  65.  Any  person  desiring 
to  compare  his  standards  with  those  of  the  groups  of 
judges  in  this  survey  need  but  measure  these  standard 
samples  on  the  Ayres  Scale  and  compare  his  results  with 
those  given  in  the  table. 

The values of the standard samples on the Thorndike
Scale were reported by Thorndike at the time the samples
were published and are also given in Table XII, page
67. To check the Gary scoring, the samples were sent to
Dr. Ayres with the request that they be scored by competent
judges in the division of education of the Russell
Sage Foundation, of which he is director. The median
scores of five judgments on each sample as reported by
him are also to be found in the table. These will be
considered as establishing the true Ayres value of the
standard samples.

The individual judgments of each of the five Gary
judges were checked against the Ayres standards.
Individual variations were apparent, but the average score
of the two judges who scored 90 per cent. of all the
papers in this survey was found to agree almost exactly
with the Ayres standards. These judges are C and D
in Table X. The median score for each judge on each
class for each case of multiple scoring was next
determined, and in no case found to differ from the average
score of these two judges more than six points higher or
lower.1 Further, the few classes not scored by them
were compared, grade for grade, with the scores of the
same classes in other tests which had been so scored.
In only five cases was the variation more than five points,
or half a step,2 including all differences caused by
absence, difference in testing conditions, etc. The Gary
results, therefore, nowhere depart from the official
Ayres standards more than half a step.

AGREEMENT  BETWEEN  TESTS 

A  by-product  of  repeated  testing  is  the  check  the  results 
yield  upon  the  tests  themselves.  Questions  as  to  the 
degree  of  correspondence  from  test  to  test,  both  of  class 
and  individual  scores,  will  now  be  considered. 

The  individual  scores  of  42  eighth  grade  children 

1The median of 20 cases was 1.3; 7 cases higher and 13 lower.
2The median of 20 cases was 3.0.
present for all tests used in the survey were compared
as to scores for quality of writing in the three writing
tests. In quality, 40 per cent. of the individuals maintain
the same position within the group1 when measured
by the dictation tests as when measured by the
composition test. For comparisons between the free choice
test and dictation tests, the index of correspondence
is the same, while the result was 86 per cent. for
comparisons between the composition and free choice tests.
Only one comparison for rate was possible, since in the
dictation test the rate of writing was the same for all.
But 29 per cent. of the children were found to maintain
the same position in the group on the basis of comparison
of scores in rate of writing in the composition and free
choice tests (Table XIII, page 71). In other words, for
the Gary children the free choice test yields samples of
writing which agree closely with the quality of writing the
children show in their compositions.

For  the  various  class  scores  the  actual  differences  in 
quality  between  the  class  result  and  the  generalized 
city  wide2  results  were  found.  In  some  classes  the  scores 
made  in  the  composition  test  agree  with  those  made  in 
the  free  choice  and  dictation  tests.  In  other  classes 
there  is  a  wide  divergence  between  the  three  scores. 
For  instance,  class  No.  28  Froebel,  4A  grade,  was,  in 
the  composition  test,  2.3  points  above  the  city  wide 
score, the free choice 4.8 points above and in the

1See XI of Appendix A, page 480.
2See XI of Appendix A, page 473.
Table XIII (Continued)

This table is to be read as follows: If the relations of the individual
scores for handwriting to the class median in the dictation test are
compared with the relations of the score in handwriting of the same
individuals to the class median in the composition test, it will be found that
40% of the individuals maintain the same positions in the two tests
within one unit of variability; that is, within 2.5 Ayres units in the
dictation test and 5 Ayres units in the composition test.

Note that the coefficient of correspondence between the free choice
test and the composition test is 86%. In other words, the free choice
test reveals the quality of the ordinary writing of more than five sixths
of the children, and, for Gary at least, is a good test.
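One plausible reading of this index can be sketched in Python. The exact procedure is given in Appendix A and is not reproduced here, so the rule below is an assumption: each pupil's deviation from the class median is expressed in units of variability for that test, and the pupil is counted as keeping the same position when the two scaled deviations differ by no more than one unit. The function name and the scores are invented for illustration.

```python
from statistics import median


def correspondence_index(scores_a: list[float], scores_b: list[float],
                         unit_a: float, unit_b: float) -> float:
    """Per cent. of pupils holding the same relative position in two tests.

    Assumption: positions agree when the deviations from the class
    medians, scaled by each test's unit of variability, differ by at
    most one unit.
    """
    median_a = median(scores_a)
    median_b = median(scores_b)
    same = 0
    for a, b in zip(scores_a, scores_b):
        position_a = (a - median_a) / unit_a
        position_b = (b - median_b) / unit_b
        if abs(position_a - position_b) <= 1.0:
            same += 1
    return 100.0 * same / len(scores_a)


# Hypothetical quality scores for five pupils in two tests.
dictation = [50, 52.5, 55, 47.5, 60]
composition = [45, 50, 55, 40, 50]
index = correspondence_index(dictation, composition, unit_a=2.5, unit_b=5.0)
```

With these invented scores four of the five pupils keep their position, giving an index of 80 per cent.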

dictation test 2.5 points above. The three tests thus
confirm each other in showing that the writing done
by this class is slightly above the grade level for the
city as a whole. On the other hand, class No. 10
Beveridge, 5A grade, is in the composition test 3.0 points
below the general level, and in the free choice test 7.5
above the general level, and in the dictation test 2.0
points above. In this case the three tests give
divergent results.

If the 57 class differences in quality of writing in the
composition test are compared with the corresponding
differences in the free choice test it will be found that in
29 cases the class scores are either both above or both
below the city wide scores; in 28 cases there is
disagreement, one of the two scores being above and the other
below the city wide scores. The comparison of the results
from the dictation and free choice test gives almost
exactly the same results, although the actual classes in
FIGURE 12

Class Differences from the Median in the Composition
and Free Choice Tests

The heavy line through the center of the figure represents the city wide
median scores for both tests. Distances above the line represent the
amounts by which class scores are above the median; distances below the
line represent the amounts by which class scores are below the median.
The magnitudes of deviations are shown by the scales along the left hand
vertical axis. The extreme variations are marked with letters to show
the name of the school. B means Beveridge, F, Froebel, E, Emerson,
and J, Jefferson. Solid line represents the results of the
composition test (C), dotted line represents the free choice test (F.C.).

The reader should note that in some cases the results of the different
tests are in close agreement; in other cases the two give very different
results. For instance, class No. 1 is shown as approximately 5 points
below the city wide median by both tests. On the other hand, class No.
54 is shown nearly 10 points above the median by the free choice test,
and nearly 10 points below the median by the composition test.

In the main, the curves show that the two tests agree in all grades
within 5 points.
which disagreement is found are not the same as for
the previous comparisons. Comparison of the results
of the writing in the composition test with that in
the dictation test reveals a little closer correspondence.
Even here, however, there are 34 cases of agreement and
23 of disagreement (Table XIV, page 73, and Figure 12,
page 74).
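The counts of agreement and disagreement rest on a simple sign comparison, which can be sketched as follows. The function name is hypothetical, a difference of exactly zero is arbitrarily treated as "not above", and the only data used are the two classes quoted in the text (Froebel No. 28 and Beveridge No. 10).

```python
def count_sign_agreement(pairs: list[tuple[float, float]]) -> tuple[int, int]:
    """Count class-score pairs lying on the same side of the city wide
    score (agreement) and on opposite sides (disagreement)."""
    agree = sum(1 for first, second in pairs if (first > 0) == (second > 0))
    return agree, len(pairs) - agree


# Differences from the city wide score: (composition, free choice, dictation).
froebel_28 = (2.3, 4.8, 2.5)
beveridge_10 = (-3.0, 7.5, 2.0)
classes = [froebel_28, beveridge_10]

composition_vs_free_choice = count_sign_agreement([(c[0], c[1]) for c in classes])
composition_vs_dictation = count_sign_agreement([(c[0], c[2]) for c in classes])
free_choice_vs_dictation = count_sign_agreement([(c[1], c[2]) for c in classes])
```

For these two classes the composition test agrees with each of the other tests once and disagrees once, while the free choice and dictation tests agree in both cases.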

On  the  other  hand,  it  should  be  noted  that  for  the 
whole  57  cases,  the  amount  of  divergence  from  the  city 
wide  scores  is  in  most  instances  less  than  5  points. 
In  only  6  does  the  extreme  divergence  exceed  10  points. 
In  other  words,  the  three  tests  yield  results  which  are 
in  close  agreement  and  the  differences  are  relatively 
insignificant  when  it  is  remembered  that  the  average 
difference  of  judgment  in  using  the  scale  amounts  to  from 
3  to  5  points.  The  rest  of  the  difference  is  probably 
due  to  marked  differences  in  training.  For  example, 
in  the  graph  it  will  be  seen  that  in  the  lower  grades 
the  extremely  high  scores  are  often  those  of  classes 
from the Beveridge school. The training in composition
work in this school1 is shown to be much below
that  of  the  city  generally.  Consequently  the  difference 
in  the  quality  of  writing  in  these  tests  is  in  some  way 
probably  related  to  the  differences  in  training  in  English 
composition. 

From  the  table  as  a  whole,  therefore,  it  is  possible  to 
draw  the  conclusion  that  there  is  no  general  relationship 
between the three types of scores. For the Gary children,

1See Chapter VI, page 234.
one of these tests is as suitable a measure of the
relative  quality  of  handwriting  as  either  of  the  others. 
Whether  or  not  this  would  be  true  in  other  schools  in 
which  children  are  given  a  different  type  of  training  is, 
of  course,  another  question,  and  one  that  cannot  be 
settled  on  the  basis  of  the  present  data. 

RANGE  OF  INDIVIDUAL  ABILITY 

One  other  point  needs  to  be  considered,  the  range 
of  ability  within  the  class.  The  distributions  of  the 
grade scores for quality of writing were found together
with the standard deviations for certain grades,
and  the  coefficient  of  variability,  based  upon  the  same 
(Table  XV). 

The  range  of  variation  in  grades  four,  six,  and  eight 
proved  to  be  much  less  than  in  the  other  grades.  But 
these  are  precisely  the  grades  in  which  the  papers  were 
scored  by  more  than  one  judge.  In  other  words,  the 
effect  of  multiple  scoring  was  to  reduce  the  apparent 
variability.  This  effect  must  be  taken  into  consideration 
when  making  comparisons  with  results  from  other  school 
systems. For instance, in Rockford, Ill., the coefficient of
variability  for  grades  five  to  eight  is  17  per  cent.,  which  is 
much  lower  than  for  the  Gary  eighth  grade  results  (24  per 
cent.),  but  no  details  are  given  as  to  the  number  of 
judgments  upon  which  each  score  was  based,  or  amount 
of  variation  in  the  standards  of  the  judges,  and  the 
like. 
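For readers comparing such figures, the coefficient of variability is conventionally the standard deviation expressed as a percentage of the mean. The sketch below assumes that definition and uses the population standard deviation; the report does not state which form was used, and the scores are invented for illustration.

```python
from statistics import mean, pstdev


def coefficient_of_variability(scores: list[float]) -> float:
    """Standard deviation as a percentage of the mean.

    Assumption: the population standard deviation is intended.
    """
    return 100.0 * pstdev(scores) / mean(scores)


# Hypothetical quality scores for one grade.
cv = coefficient_of_variability([40, 50, 60])
```

Because averaging several judgments per paper narrows the spread of scores, the same pupils would show a smaller coefficient under multiple scoring than under single scoring.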

Therefore,  in  making  judgments  in  this  survey,  one 
should be governed rather by the samples of writing
shown and by the comparisons from test to test and
from grade to grade than by comparisons from city
to city, except where tests are used to show general
relations.


IV.    SPELLING 

§1. General Results

SPELLING, like handwriting, is an ability which is
considered easily measurable, and there are several
scales and tests available for the careful evaluation
of the results of teaching effort. Moreover, at
Gary the annual time allotment for spelling is 496 hours
as compared with 482 hours, the average time allotment
of fifty American cities.


TESTS  USED 

Three methods of testing spelling were used at Gary.
Conventional list tests were given to measure the
conventional school product of the teaching of spelling.
Next, timed dictation tests were used in an attempt to
control the rate of writing, to prevent deliberation and
to insure both automatic spelling and attention to
expression of thought. Finally, the papers written in the
composition test were scored for errors in spelling. The
conclusions as to spelling ability are thus based upon
three very different types of results.
LIST TESTS

The words used in the list test were the same as those
of the Cleveland Survey (Figure 13). In the figure is
shown also the division of the Ayres Spelling Scale in
which the words occur, and the standards of accuracy
for the different grades as determined by Ayres' investigations.
The words are so chosen that the accuracy of
spelling should be the same from grade to grade; that is,
the fourth grade children should, according to Ayres, be
able to spell the words by which they are tested with the
same degree of accuracy (76 per cent.) as the fifth grade
children are able to spell the words in the fifth grade test
(76 per cent.). In other words, the increase in difficulty
in the spelling tests is supposed to keep pace exactly
with the increase in spelling ability. The average1
accuracy from grade to grade is thus constant.

In the Cleveland Survey there were no spelling tests in
grades higher than the eighth. At Gary, however, the
eighth grade words were given also to grades nine, ten,
eleven, and twelve. It must be particularly noted that
this repetition of the same words through several grades
constitutes an entire change of method. The purpose
of the change was to determine how rapidly the ability to
spell the eighth grade words was developed by high school
work.


1The term "average" is used throughout this report in its popular
sense, to indicate the arithmetical mean, or the sum of all the scores
divided by their number.
The list tests were given by the room teacher in the
presence  of  the  examiner.  No  suggestions  were  made  as 
to  how  the  test  should  be  conducted,  other  than  to  ask 
that  it  be  given  in  the  "usual  manner."  Accordingly 
some  teachers  dictated  the  words  slowly,  some  rapidly. 

TABLE XVI
Average Accuracy in List Words

         GARY RESULTS                 AYRES STANDARD
GRADE    ACTUAL GRADE   GENERALIZED   AVERAGE OF   ACTUAL        CLEVELAND
         AVERAGE        VALUES        84 CITIES    DIFFERENCES   RESULTS
2        51             51            77           -26           74
3        56             54            77           -21           78
4        53             54            76           -23           73
5        51             54            76           -25           75
6        58             56            76           -18           78
7        62             57            76           -14           76
8        53             57            76           -23           80
AVERAGE  54.9           55            76           -21           76

Scores of High School Classes in Spelling the Words of the
Eighth Grade Test

GRADE                9     10    11    12
Actual Score         57    71    79    80
Generalized Score1   60    69    77    80

1See Appendix A, page 474.

This table is to be read as follows: The average accuracy of
spelling of the test words for the second grade by the second grade
children at Gary was 51%. Ayres' standard score for these same
words was 77%. The Gary second grade record, therefore, is 26%
below the standard. The average score made by Cleveland second
grade classes in spelling the same words was 74%.
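The difference column and the averages in Table XVI can be recomputed directly from the grade scores. The following sketch reads the grade column as running from 2 through 8; the variable names are illustrative only.

```python
GRADES = [2, 3, 4, 5, 6, 7, 8]
GARY_ACTUAL = [51, 56, 53, 51, 58, 62, 53]      # actual grade averages
AYRES_STANDARD = [77, 77, 76, 76, 76, 76, 76]   # average of 84 cities

# Gary score minus Ayres standard, grade by grade.
differences = [gary - ayres for gary, ayres in zip(GARY_ACTUAL, AYRES_STANDARD)]

# Arithmetical means, as the report uses the term "average".
mean_gary = sum(GARY_ACTUAL) / len(GARY_ACTUAL)
mean_difference = sum(differences) / len(differences)

for grade, diff in zip(GRADES, differences):
    print(f"Grade {grade}: {diff:+d} points relative to the Ayres standard")
```

The recomputed differences match the table, and the mean shortfall is a little over 21 points.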
Some read the words twice and gave explanations of
meaning, or illustrations of use that were helpful, others
did nothing but read the list of words. That is, the
conditions under which the tests were given varied from
room to room.

On the average, the Gary children spell the test words
with an accuracy of approximately 55 per cent. (Table
XVI, page 82). The scores made by the high school
classes on the eighth grade words show a gradual
improvement. The record of the eighth grade class on the
eighth grade words is 53 per cent., but by the end of the
twelfth grade the same words are spelled with 80 per cent.
accuracy.

COMPARATIVE   DATA1 

For  the  list  tests  used  at  Gary,  general  standards  are 
available,  based  upon  tests  in  eighty  four  American  cities 
as  well  as  the  results  obtained  in  the  Cleveland  Survey 
where  precisely  the  same  tests  were  used.  For  instance, 
Ayres' standard for the eighth grade words is 76 per cent.,
the eighth grade score in Cleveland was 80 per cent.,
the Gary score was 53 per cent. (Table XVI). The Gary
averages are uniformly about 20 per cent. below Ayres'
standards. That is, the Gary scores parallel the Ayres
standard, but at a much lower level (Figure 14, page 84).
This result in connection with the fact previously brought
out in regard to the scores made by the high school
classes apparently means that such of the Gary children

1See footnote, page 38.
Figure 14

Gary Scores in the Cleveland List Spelling Test Compared
with Ayres' Standards

[Line graph: average per cent. of accuracy plotted against grades, with curves for the Gary scores and the Ayres standard]

The scale along the base of the figure represents grades. The scale at
the left of the figure shows average per cent. of accuracy of spelling. The
solid line represents Gary scores (generalized). The dotted line
represents actual grade averages showing variation from grade to grade.
The light solid line represents Ayres' standards based upon results
secured in eighty-four American cities. The portion of the curve to the
right of the vertical line represents results in the high school grades in
which the same eighth grade words were repeated from grade to grade.

The Gary curve parallels the curve for Ayres' standards, but at a
much lower level. The eleventh grade in the high school is the first Gary
grade to spell the eighth grade test words as well as the eighth grade
children in the average conventional school.

In the graph as constructed the increase in difficulty of words from grade
to grade is not shown. (Compare Diagram 21, Measuring the Work of
the Public Schools, Cleveland Survey.)
Figure 15

Eighth Grade Class Scores at Cleveland and Gary in
Spelling the Same Twenty Words

Average Accuracy: Gary 53, Cleveland 80

[Chart: one rectangle per class, placed along a horizontal scale of average per cent. of accuracy from 0 to 100]

Each rectangle represents the score of one class. The position of the
rectangle over the scale along the base of the figure shows the average
accuracy of spelling of that class. The Gary classes are shown in solid
black. Four of the five Gary classes are lower than any class in
Cleveland.

as  remain  in  school  eventually  learn  to  spell  the  common 
words  used  in  the  tests  as  well  as  they  are  spelled  by  the 
eighth  grade  children  in  the  average  conventional  school, 
but  in  time  about  three  years  later.  Out  of  90  eighth 
grade  classes  in  Cleveland,  only  15  have  scores  as  low  as 
the  highest  eighth  grade  class  at  Gary,  while  4  out  of  5 
eighth  grade  classes  at  Gary  are  lower  than  any  eighth 
grade  class  in  Cleveland  (Table  XVII,  Figure  15). 

An inspection of the individual scores of the eighth
grade children reveals the fact that about 51 per cent.
of these children misspelled more than half of the twenty
test words. As measured by such list tests, therefore,
the Gary children would seem to have little ability to
spell the words shown in Figure 13.

DICTATION  SPELLING  TESTS 

Dictation spelling tests were used to measure the
development of spelling ability. Exactly the same tests
were given in several successive grades. The improvement
in ability from grade to grade, however, makes it
impractical to continue with the same words throughout
all the grades. For instance, by the fourth grade the
degree of accuracy on second grade words reaches such a
high level (94 per cent.) that the words are no longer
an adequate test. To meet this situation, changes were
made in certain grades from easy to more difficult
material. In these grades the children took both the
easy and the difficult test. The continuity of development
in spelling ability, as revealed by the series of
tests as a whole, is thus unbroken.

A  second  difference  between  the  list  and  dictation  tests 
should  be  noted.  In  the  dictation  test  the  words  were 
given  in  sentences  and  a  definite  time  allowed  for  the 
writing  of  each  sentence.  Thus  the  words  "every" 
and  "race"  in  the  first  test  were  dictated  to  the  children 
in  the  sentence,  "Every  boy  likes  to  see  a  race,"  and 
fourth  grade  children  were  allowed  thirty  two  seconds 
in  which  to  write  it.  The  time  allowances  were  changed 
from grade to grade to correspond to the increasing
maturity of the children.

Figure 16

Dictation Spelling Tests and Scheme for Giving

Ayres' standards¹ by grade for each test:

TEST     AYRES' SET   GRADE 2    3     4     5     6     7     8    9-12
Test 1   J              66%     84%   94%
Test 2   O-P-Q                        66%   79%   88%
Test 3   S-T-U                                    66%   79%   88%
Test 4   W-X-Y-Z                                              66%    *

*No standards given.

Words Used in Dictation Sentences

[Word table for Tests 1 to 4, twenty words to each test, only partly
legible. Surviving entries, in the order printed: forget, beg, victim,
judgment*, blue, emergency*, importance, athletic*, senate,
organization*, eight, agreement, complaints, entitle, committee,
government, separate, responsible, Wednesday, especially, dark,
pleasant, recommend*, majority, preliminary*, flight, organize,
decision*, minute, allege, century, principle, glad, age, suggest,
February, still, cordially, business, disappoint.]

¹Ayres' standards are based on words dictated in lists, untimed, and are
probably high by 5 per cent. for words in timed sentences. This is
probably offset by the fact that the tests were given in May, while the
standard values are for tests given at the middle of the year. (See Judd:
Measuring the Work of the Public Schools.)

The complete scheme for the dictation tests and the
actual words used therein are shown in Figure 16. It
should be noted that the dictation tests were made to
supplement the list test, both in the character of the
spelling product tested and in the levels of ability measured.

In general, the results from the dictation tests (Table
XVIII, page 90) fully confirm those from the list tests.
The eighth grade score on the easy words for the grade
was 69 per cent.; on the difficult words, 50 per cent. In
grades two to four the total growth shown in the two
year interval was 41 per cent. (second grade 42 per cent.,
fourth grade 83 per cent.). For grades four to six the
growth was but 34 per cent., from grades six to eight
20 per cent., from grades eight to twelve 33 per cent.
In other words, the results show that the growth is
small from grade to grade and relatively decreases as the
difficulty of the words increases. This fact is shown
graphically (Figure 17, page 91) by the change in the
slant of the development curves in the successive grades.

COMPARATIVE DATA¹

The Gary results are consistently lower than the
Ayres general standards. The magnitude of the difference
is approximately the same as for the list tests (20
per cent., Table XVIII, Figure 17). The Gary scores
are lower also than those resulting from the use in
Detroit of exactly the same tests. Measurement by
dictation tests thus confirms the conclusion previously
reached in regard to the lack of development of spelling
ability in the Gary children.

¹See page 38.

Figure 17
Results of Dictation Tests

Scale along the base of the figure represents grades. Scale along the
vertical axis represents per cent. of accuracy in dictation tests. Solid
lines represent Gary results in Test 1 (grades two, three, and four),
Test 2 (grades four, five, and six), Test 3 (grades six, seven, and eight)
and Test 4 (grades eight, nine, ten, eleven, and twelve). Broken line
represents Ayres' standards for the words used in the dictation tests.
In grades eight, nine, ten, eleven, and twelve the broken line does not
represent the Ayres' standards, but the scores made by the same grades
in the list tests given previously. It should be noted that in the fourth,
sixth, and eighth grades, where it was necessary to change from easy to
difficult words, both tests were given.

Conclusions: The Gary scores are shown to be below Ayres' standards.
As the difficulty of the words increases, the slant of the development
curves from lower to high grades decreases. In the fourth
grade, the Gary results on the easy test exceed those of the Ayres on the
hard words. In the sixth grade the Gary results on the easy words are
just equal to Ayres' standards on the hard words. In the eighth grade
the Gary results on the easy words are very slightly above Ayres'
standard for the difficult words. The four tests are, therefore, consistent in
showing that the Gary children are from one to two years behind children
in conventional schools in their ability to spell common words.

COMPOSITION  SPELLING  TESTS 

As a check upon the formal spelling tests, misspellings
in papers written in the composition test were tabulated.
The errors noted were of two sorts: slips, or trivial
mistakes, such as the omission of "d" from the word "and,"
and more serious misspellings, such as "peise" (piece).
The number of words misspelled per thousand running
words was called a spelling coefficient. Thus in the
eighth grade the total number of running words in 122
papers was 27,610; the total number of all misspellings
was 720. Of this number, however, 140 were slips,
leaving 580 words misspelled according to the rules
adopted for the scoring. The spelling coefficient for
total misspellings was 26.07 (720 ÷ 27.61) and for the
slips 5.07 (140 ÷ 27.61). In other words, the general
accuracy of the eighth grade spelling in the composition
test was 97 per cent. (1.00 - .026) if all the mistakes
were counted, or 98 per cent. (1.00 - .021)¹ if slips were
not considered misspellings. Thus the results from the
composition test seem directly to contradict the results
of the formal tests in spelling.

¹(.026 - .005 = .021).
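
The coefficient arithmetic above can be retraced in a few lines. This is only an illustrative sketch: the function and variable names are invented here, and the figures are the eighth grade counts quoted in the text. Note that 720 ÷ 27.61 comes to about 26.08 when rounded, so the printed 26.07 appears to have been cut off rather than rounded.

```python
def spelling_coefficient(errors: int, running_words: int) -> float:
    """Return the number of errors per thousand running words."""
    return errors / (running_words / 1000)


# Eighth grade counts quoted in the text.
RUNNING_WORDS = 27_610
TOTAL_ERRORS = 720   # slips and misspellings together
SLIPS = 140          # trivial mistakes only

total_coeff = spelling_coefficient(TOTAL_ERRORS, RUNNING_WORDS)
slip_coeff = spelling_coefficient(SLIPS, RUNNING_WORDS)
# True misspellings are obtained by subtraction, as in Table XIX.
misspelling_coeff = total_coeff - slip_coeff

# Accuracy is the share of running words spelled correctly.
accuracy_all_errors = 1 - total_coeff / 1000
accuracy_without_slips = 1 - misspelling_coeff / 1000

print(f"total coefficient: {total_coeff:.2f}")
print(f"slip coefficient:  {slip_coeff:.2f}")
print(f"accuracy, all errors counted: {accuracy_all_errors:.0%}")
print(f"accuracy, slips not counted:  {accuracy_without_slips:.0%}")
```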

The full degree of this apparent contradiction is shown
by the actual and generalized city wide median
coefficients for both slips and misspellings in each grade (Table
XIX below; Figure 18, page 94). The general accuracy
of spelling is high, 92 per cent. in the fourth grade,¹
the lowest tested, and the improvement is marked from
grade to grade. The total errors in the twelfth grade

TABLE XIX

Spelling Coefficients in Composition Tests

Coefficient found by dividing number of words misspelled by number
of words written. Represents number of words misspelled per thousand
words written. S stands for slips; M for misspellings; T for total errors.
S and T are tabulated. M found by subtraction.

          ACTUAL CITY WIDE MEDIANS      GENERALIZED SCORES
GRADE        S       M       T            S      M      T
  4        16.8    57.0    73.8         17.0    58     75
  5        13.3    52.6    65.9         14.0    51     65
  6        10.6    43.1    53.7         10.0    41     51
  7         7.5    24.3    31.8          8.0    27     35
  8         6.1    13.6    19.7          6.0    17     23
  9         3.2    13.9    17.1          4.0    13     17
 10         3.4     9.8    13.2          4.0     9     13
 11         .84    8.56    9.40          3.0     7     10
 12         3.0     6.0     9.0          3.0     6      9

This  table  is  to  be  read  as  follows:  The  fourth  grade  children  at 
Gary  in  their  compositions  made  16.8  minor  errors  (slips)  in  each  1,000 
words  written,  misspelled  57  words  per  thousand,  or  made  a  total  of 
73.8  mistakes  in  spelling  per  thousand  words  written.  In  comparison 
with  the  scores  of  other  grades  and  for  the  purpose  of  smoothing  the 
development  curves,  the  city  wide  scores  for  the  fourth  grade  are  taken 
as  17,  58  and  75  respectively. 
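
Since M is found by subtraction, each row of Table XIX can be checked against the rule M = T - S. The sketch below is not part of the original study; the dictionary layout and the function name are assumptions made for illustration, and the values are simply those printed in the table. Decimal arithmetic is used so that values such as 73.8 - 16.8 compare exactly.

```python
from decimal import Decimal

# Grade -> (S, M, T) as printed in Table XIX.
ACTUAL = {
    4: ("16.8", "57.0", "73.8"),
    5: ("13.3", "52.6", "65.9"),
    6: ("10.6", "43.1", "53.7"),
    7: ("7.5", "24.3", "31.8"),
    8: ("6.1", "13.6", "19.7"),
    9: ("3.2", "13.9", "17.1"),
    10: ("3.4", "9.8", "13.2"),
    11: (".84", "8.56", "9.40"),
    12: ("3.0", "6.0", "9.0"),
}
GENERALIZED = {
    4: ("17.0", "58", "75"),
    5: ("14.0", "51", "65"),
    6: ("10.0", "41", "51"),
    7: ("8.0", "27", "35"),
    8: ("6.0", "17", "23"),
    9: ("4.0", "13", "17"),
    10: ("4.0", "9", "13"),
    11: ("3.0", "7", "10"),
    12: ("3.0", "6", "9"),
}


def inconsistent_grades(table: dict) -> list:
    """Return the grades whose printed M differs from T - S."""
    bad = []
    for grade, (s, m, t) in table.items():
        if Decimal(t) - Decimal(s) != Decimal(m):
            bad.append(grade)
    return bad


print(inconsistent_grades(ACTUAL))
print(inconsistent_grades(GENERALIZED))
```

An empty list for both halves of the table means every printed M agrees with the subtraction rule.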

¹(1.000 - .075 = .925).

Figure 18
Total Errors in Spelling

Grades are shown along the horizontal scale. The number of words
misspelled per thousand along the vertical scale. The curve labelled
"misspellings" is based upon total errors; that labelled "slips" upon the
number of trivial errors. The true misspellings are represented by the
differences between the two curves.

Note the great changes from grade to grade and the small number of
errors made by high school classes. Analysis of the character of these
errors would indicate that the apparent improvement is due almost
entirely to avoidance by children of the use of words which they cannot
spell.

amounted to but nine words misspelled per thousand
words written. On the basis of such data, the spelling
abilities of the children in Gary would seem to be
practically perfect and the results reported in the previous
tables grossly to misrepresent the true conditions.
The explanation of this apparent conflict between the
formal and the composition spelling tests is a matter of
inference. Careful investigation, however, makes it
probable that no contradiction between the results of the
various tests really exists, and that the spelling in the
composition test confirms the conclusions drawn from the
results of the formal tests.

In the first place it must be remembered that
misspelling one word does not necessarily have the same
significance that misspelling another word may have.
Consequently, instead of accepting the coefficients given
in Table XIX, page 93, at their face value, there must be
proper evaluation, both of the words used by the children
and of the words misspelled. For instance, to say that
580 words out of 27,610 were misspelled is literally true,
but entirely misleading. The sum of the frequencies of
use of the fifty words which occur most often is 14,598
(Table XX, page 97), or more than 50 per cent. of the
total number of words used. Needless to say,
misspellings of such simple words as "a," "the" and "and" are
very few. If the fifty most frequent words and their
repetitions were eliminated from the total list of words used,
the accuracy of spelling would be lowered from 580 words
misspelled in 27,610 (98 per cent. accuracy) to 543¹
words misspelled in 13,012 (96 per cent. accuracy).
That is, the composition test may not measure ability to
spell on the same level of word difficulty as the formal
tests.

¹Words actually misspelled in remaining 13,012 running words. An
analysis of the 31 words used in the eighth grade formal spelling tests
shows that but 3 words are rated by Jones (Concrete Investigation of
the Material of English Spelling. W. Franklin Jones, University of South
Dakota), as of fourth, or lower grade difficulty, 7 are assigned to grades
5 or 6, 13 to grades 7 or 8, while 8 of the more difficult words do not
appear in Jones' list at all.
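
The effect of setting aside the fifty most frequent words can be reproduced from the counts given above. The following is a minimal sketch with invented variable names; it assumes, as the footnote states, that 543 of the 580 misspellings fall among the remaining running words.

```python
# Eighth grade composition counts quoted in the text.
total_running_words = 27_610
frequent_running_words = 14_598   # uses of the fifty most frequent words
misspellings_all = 580
misspellings_remaining = 543      # misspellings outside the fifty words

remaining_running_words = total_running_words - frequent_running_words
share_frequent = frequent_running_words / total_running_words

accuracy_all = 1 - misspellings_all / total_running_words
accuracy_remaining = 1 - misspellings_remaining / remaining_running_words

print(f"remaining running words: {remaining_running_words}")
print(f"share taken by the fifty words: {share_frequent:.1%}")
print(f"accuracy on all words:       {accuracy_all:.1%}")
print(f"accuracy on remaining words: {accuracy_remaining:.1%}")
```

The same count of errors weighs nearly twice as heavily once the easy, constantly repeated words are removed from the denominator.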

On the other hand, the Gary children in their
compositions used mainly second grade words¹ (Table XXI, page
100, and Figure 19, page 101). Eighty six per cent. of
the running words and 53 per cent. of the different


¹In the appendix, Sec. IV, page 406, will be found a list of the words
misspelled in the Gary eighth grade compositions, excluding slips, the
frequency with which they were used, and the frequency of their
misspelling; also the grades in which such words are first used by 2 per cent.
of the children in the conventional schools (according to Jones). The
reader should examine this list and see for himself the actual words
misspelled by the Gary eighth grade children. The words in Jones' "second
grade" list are not necessarily easy words. In fact Jones states, "The
very words that give most trouble in spelling are almost invariably found
in the second or third grade lists, and faithfully reappear throughout
the subsequent years." (However, see Table VLSI B, Section IV, Appendix
A, page 414, for the sense in which this statement is to be taken. Even
the hundred "spelling demons" are easy words for the average eighth
grade class.) An analysis of the first 250 words of this report (page 3)
shows 43 per cent. second grade words, 5 per cent. third grade, 3 per cent.
fourth grade, 3 per cent. fifth grade, and 9 per cent. sixth grade or higher,
10 per cent. N. L. D., 27 per cent. N. L. Hence, the importance of
examining the actual words misspelled by the Gary eighth grade children.
(If these results are compared with those on page 230, the author will
seem to be using nearly as many second grade words as the Gary eighth
grade children. However, it should be remembered that Table LVI is
based on 2,500 different words, and the results above on 141. The
larger the total number of words, the smaller the percentage of second
grade words. Thus, for the 396 different words used in the first 1,000
words of this report the author's percentage of second grade words drops
to 35 per cent.)

words are of second grade difficulty. Also 55 per cent.
of the 376 different words misspelled and 59 per cent. of
the 580 misspellings were words which are classed as
second grade words in Jones' vocabularies. Whenever
they did have occasion to use more difficult words, these
words were usually misspelled. Thus only 11 different
words ranked as eighth grade words by Jones were used
in the Gary eighth grade compositions. These 11 words
were used a total of eighteen times and misspelled fifteen
times, so that the average accuracy of spelling was about
17 per cent.
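
The contrast between easy and difficult words can be put side by side with one small helper. This is a sketch only: the function name is invented, the eighth grade word counts come from the paragraph above, and the second grade word counts are those given in the caption of Figure 19 (which covers only words misspelled by at least one child).

```python
def accuracy(times_used: int, times_misspelled: int) -> float:
    """Share of uses in which the word was spelled correctly."""
    return 1 - times_misspelled / times_used


# Words ranked by Jones as eighth grade words: 18 uses, 15 misspellings.
eighth_grade_words = accuracy(times_used=18, times_misspelled=15)

# Second grade words per Figure 19: 5,746 uses, 344 misspellings.
second_grade_words = accuracy(times_used=5_746, times_misspelled=344)

print(f"eighth grade words: {eighth_grade_words:.0%}")
print(f"second grade words: {second_grade_words:.0%}")
```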

It will be remembered that one of the spelling dictation
tests given to the eighth grade (Test 3, page 88) was
composed of 20 words chosen from sets S-T-U of Ayres'
Scale. When the eighth grade misspellings were analyzed
in terms of the Ayres' Scale (Table XXVII, page 127),
it was found that 18 words almost equally distributed
among sets S-T-U of Ayres' Scale had been used
spontaneously by the children in their compositions. In other
words, these 18 words constitute a spontaneous,
self-imposed spelling test comparable with formal dictation
test No. 3, the only difference being that the formal
test was given to 127 eighth grade children, while at most
only half this number used the words spontaneously.
The correspondence in the results is almost perfect, as is
shown in the following table which gives the average
accuracy of spelling in the two tests:


                    GARY SCORE    AYRES' STANDARD    DIFFERENCE
Formal Test             69              88              -19
Spontaneous Test        71              88              -17

The apparent contradiction between the formal tests
and the composition test is thus due to the fact that the
Gary children in their compositions did not use many
words of the same level of spelling difficulty as the
formal tests.

However, there is a further point to be considered.
The fact remains that the Gary eighth grade children
made very few spelling mistakes in writing their
compositions. The question arises, "Is this favorable
result to be credited to the Gary teaching, or does it mean
that in the absence of effective training the Gary children
merely followed a natural tendency to avoid words which
they did not know how to spell?"

Unfortunately, the comparative data which are
necessary to determine this point are not available, but one
item of the results appears significant. One hundred
and fifty-eight, or 42 per cent., of the words misspelled
in the eighth grade compositions were misspelled also
in the fourth grade compositions. That in spite of the
normal increase in vocabulary which takes place from
the fourth to the eighth grade, so large a proportion of the
misspellings should be words which have been used, and
used repeatedly, through four years of school work, tends
to confirm the inference in regard to the failure of the
Gary training, and to suggest that the apparent
favorable results are in no way either real or a credit to the
Gary system.¹

In other words, the Gary eighth grade children in their
compositions gave ample proof that they were unable
to spell as well as the children in conventional schools the
words of the Ayres' Scale which have been shown to be
the most frequently used words in the English
language.

¹The effect might possibly, though not probably, be due to the
overlapping of grades, or the presence in the eighth grade of many children of
fourth grade ability, and the presence in the fourth grade of many
children of eighth grade ability.

FIGURE 19

Character of the Words Misspelled in the Eighth Grade
Compositions

The small square indicates the scale upon which the remaining figures
are drawn and represents five words. The total area of all the other
figures represents 6,426 running words, the total frequency of use of the
words which were misspelled by some member of the group. The large
square represents the words which are classified as second grade words by
Jones. There were 209 second grade words misspelled by the eighth grade
children. These words were used a total of 5,746 times, and were
misspelled 344 times. The accuracy of the eighth grade spelling of the
second grade words was, therefore, 94 per cent. The remaining diagrams
show the relative frequency with which words classified by Jones as third,
fourth, and fifth grade, etc., were used, and the relative accuracy of
spelling. UNL means words which were not listed by Jones. D means
derivatives, and words not counted because proper nouns, etc.

The figure as a whole shows that the words misspelled by the eighth
grade children were mainly second grade words, and that as the
difficulty of the words increased, the accuracy of the spelling decreased. The
exception in the seventh grade is due to the frequent use of the words
"exciting" and "accident" which were written on the board by the
examiner in giving the instructions for the test.

SCHOOL  TO  SCHOOL  COMPARISONS 

Very small differences were found in spelling ability
from school to school. In general, the Beveridge and
Jefferson schools, which are less completely equipped to
carry out a modern program, do rather better in spelling
than the two larger schools, Emerson and Froebel,
although the differences are relatively insignificant
(Table XXII, page 102).

TABLE XXII
Class Deviations¹ from Generalized City Scores

¹Note that in this division of the table the plus sign indicates less ability in spelling;
for the larger the coefficient the greater the number of mistakes made.

This table is to be read as follows: Of ji classes in the Froebel
school measured in spelling by the dictation test, 7 classes were markedly
above the corresponding city scores, and 13 below. In the list test
6 classes were above and 11 below. In the composition test 1 class
had smaller coefficients than the average for the city and 14 larger.
That is, in general, the children of the Froebel school are consistently below
the average of the city in spelling ability, as was to be expected because of
the greater amount of foreign parentage.

§2.    Critical  Discussion 

SPELLING ABILITY

A spelling test is popularly supposed to measure a child's
"general ability to spell." Many persons speak of
"learning to spell" as if school training had for its purpose the
development of a general ability to spell any word without
regard to whether the word had ever been seen or spelled
before. It is easy to show by experiment, however, that
well trained adults can spell simple phonetic words which
they have never seen or heard before only with an
accuracy of (approximately) 30 per cent.¹ Therefore, the
existence of general ability to spell may well be doubted.

The  primary  purpose  of  spelling  is  to  make  a  written 
record  of  the  sounds  used  in  oral  language.  To  be  sure, 
the  word  may  not  have  been  sounded  by  the  writer, 
and  may  not  be  sounded  by  the  reader,  but  all  writing 
is  capable  of  being  translated  into  sound;  conversely, 
all  oral  language  may  be  recorded  by  means  of  appro- 
priate letters  or  groups  of  letters.  In  an  ideal  system  of 
phonography  in  which  each  sound  would  be  represented 
by  a  single  letter,  and  each  letter  by  a  single  sound,  a 
person  thoroughly  skilled  in  the  use  of  the  system  would, 
conceivably,  be  able  to  spell  any  word  correctly,  whether 

¹From an unpublished study by the Department of Educational
Research, Detroit Public Schools.


analyzing  spoken  words  into  their  sc 
in  representing  each  element  by  its  a 
the  person's  spelling  ability  would  als 

Unfortunately,  English  words  are  c 
sources  and  English  lacks  an  absolute 
English  is,  perhaps,  more  illogical  ii 
any  other  of  the  modern  languages, 
in  English  at  present  means  merely  tl 
duce   certain   conventional   symbols 
given  words.    The  fact  that  some  entii 
of  many  words,  have  a  phonetic  basi 
the  confusion.    It  may  be  true  that 
makes  it  easier  to  learn  to  spell  new  w 
general  ability  to  spell  is  concerned,  ou 
system  probably  adds  to  the  difficu 
tion. 

Hence  in  the  English  language  it  is 
to  measure  directly  a  child's  general 
The  most  that  can  be  done  is  to  meas 
spell  certain  specific  words.    In  other  ^ 
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correctly  words  of  equal  or  less  difficulty.  The  spelling 
of each word stands by itself. It is, of course, possible
to test an individual with a large number of
representative words and so finally determine the range of his
spelling ability, but general ability to spell is seldom
used with this meaning.

This point is so important and yet so often
misunderstood that it merits more extended discussion. Proof
of its truth will be shown by analysis of the results of
tests of certain classes at Gary. In grade 7, class No.
13 Emerson and class No. 14 Jefferson made almost
exactly the same average score in accuracy (71 and 70 per
cent. respectively) in spelling 20 given words, while
class No. 44 Froebel made a very much lower score
(52 per cent.) in spelling the same twenty words.
Fourteen of the twenty words were taken from Ayres' Scale,
Set U, as being words of equal spelling difficulty as
determined by the actual performances of many
thousands of children. Three words were taken from the
next easier set (T) and three from the next more difficult
set (V). However, a mere glance at an analysis of the
results by words is all that is necessary to show that while
the words may be equal for seventh grade children in
general, they are most certainly not equal for the children
in these classes. The accuracies on individual words
vary from 95 per cent. to 21 per cent. (Table XXIII,
page 107). Even if the analysis be restricted to the
results of the class making the best record, a variation
from word to word of 35 per cent. will be found. The
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the  two  classes  can  spell  8  out  of  20  1 
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The  reader  should  note  that  the  average  so 
on  the  entire  twenty  words  was  about  52  per  ( 
standard,  while  class  No.  14  Jefferson  and  c 
closely  the  same  score  (70  and  71  per  cent.), 
actually  spelled  "respectfully"  and  "celebrati 
No.  13  Emerson,  although  for  twelve  of  the  v 
Froebel  fall  very  much  below  the  scores  of  the 

of  equal  difficulty.    The  spelling  of 
learned  individually. 

It  may  be  objected  that  no  word 
entire  class,  just  as  no  word  was  con 
entire  class,  so  that  in  one  sense  the 
that  there  are  larger  differences  in  ab 
words.    From  this  point  of  view,  abili 
word  would  depend  partly  upon  gen< 
and  partly  upon  a  direct  knowledge 
word.    That  is,  of  course,  true.    ' 
word  of  all  was  spelled  correctly  by 
least  able  class.    However,  "genera 
is  seldom  interpreted  in  such  a  sen 
children  reaching  the  seventh  grade 

many  contacts  with  common  words 
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The  scale  along  the  base  of  the  figure  shows  the  various  words.  The 
scale along the vertical axis of the figure shows the accuracy with which
the different words were spelled. The words are arranged in the order
of the accuracy with which they were spelled by Class B. That is,
"folks" was the easiest word, "suggest" was the next harder word, and
"elaborate" was the hardest word of all.

The solid line (A) represents results from Class No. 13, Emerson,
seventh grade. The dotted line (B) represents results from Class No. 44,
Froebel, seventh grade.

Average accuracy of spelling based on entire twenty words: Class A,
71 per cent.; Class B, 52 per cent.

The curves show that the order of difficulty of words for Class B was
not that for Class A. The words in the test are given as of nearly equal
difficulty by Ayres. For these two classes they range in difficulty from
95 per cent. to 21 per cent. The curves also show that, although the
two classes make very closely the same scores on the first eight words,
they differ widely on the remaining twelve words. Note that for the
two words "respectfully" and "celebration" the poorer class makes
higher scores than the other.

has  a  certain  vague  "general  ability"  to  spell.  In  this 
sense  the  term  means  merely  that  in  such  a  group  there 
are  sure  to  be  some  children  who  can  spell  at  least  part 
of  a  group  of  common  seventh  grade  words. 

To  most  persons,  however,  general  spelling  ability 
is  a  term  applied  to  the  ability  of  individuals  and  implies 
that  the  individual's  ability  to  spell  seventh  grade  words 
in  general  may  be  determined  by  having  him  spell  a 
random  sampling  of  seventh  grade  words.  The  results 
shown  above  indicate  clearly  that  inferences  from  one 
set  of  words  to  another  set  of  words  may  be  in  error  by 
amounts  equal  to  three  times  the  average  yearly  progress 
from grade to grade. Under the circumstances, each
test consisting of but ten to twenty words must be considered
by itself as a test of ability to spell certain words
only.

The  impossibility  of  making  inferences  in  regard  to  an 
individual's  ability  to  spell  a  word  from  his  performance 
in  spelling  some  other  word  of  equal  difficulty  is  brought 
out  plainly  by  a  study  of  the  records  for  a  single  class. 
For  instance,  in  class  No.  13  Emerson  (seventh  grade) 
each  word  of  the  20  in  the  test  was  missed  by  at  least 
3  of  the  37  children,  while  no  word  was  missed  by 
more  than  16  children.  On  the  other  hand,  only  2 
of the children spelled all 20 words correctly, and of the

The scale along the base of the figure shows the various words. The
scale along the vertical axis of the figure shows the accuracy with which
the different words were spelled. The order of words is the same as in
Figure 20, and Curve A is taken directly from that figure. (The
comparison in Figure 20 was between two seventh grade classes having
very different average scores.) In this figure the comparison is between
classes having almost the identical average score. In spite of this fact,
the curves show that the scores for individual words vary greatly. For
instance, Class C spells "citizen" very much better than Class A, but
"arrangement" very much more poorly.

The solid line (A) represents results from Class No. 13, Emerson, seventh
grade, as in Figure 20. The dotted line (C) represents results from Class
No. 14, Jefferson, seventh grade.

Average accuracy of spelling based on entire twenty words: Class A,
71 per cent.; Class C, 70 per cent.


Three  out  of  the  37  children  in  tl 
70  per  cent.,  while  the  class  aver; 
That  is,  each  of  the  three  missed  6 
did  all  three  miss  the  same  wore 
did  two  of  the  three  make  the  sam< 
page  115).    The  variation  is  so  extr 
the  different  words  misspelled  by  thi 
age"  ability  is  to  list  14  out  of  1 
test. 

Similar  records  could  be  shown  inc 
be  evident,  therefore,  that  each  tes 
only  the  ability  to  spell  the  words  11 
the  conditions  under  which  it  was  gi 

On  the  other  hand,  as  data  accui 
dendes  may  become  apparent,  anc 
may  be  safely  made  in  regard  to  tfc 
teaching  effort  •'  It  should,  however, 
that  such  inferences  are  inferences  0 
choice  of  words  or  a  different  form  of 
different  results.  When,  however,  m< 
test  is  used  and  more  than  one  choi< 
without  bringing1  to  Kcrhf   •••••    — 
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"arrangement."     The  individual  who  had  the 
in  the  class  spelled  correctly  words  of  every  Ie\ 
easiest  to  the  third  hardest,  and  also  missed 
difficulty,  from  the  second  easiest  to  the  hardest 
shown  in  the  other  two  records. 

The  reader  should  note  also  that  judged  by 
class  score  is  higher  than  Ayres'  standards,  alt 
the  class  score  is  from  15  to  20  per  cent,  below  A 

inferences  gain  a  reliability  they  c 
possess  if  they  were  based  upon  a  sing 
"general  ability  of  the  Gary  children  i 
used  throughout  the  report  to  mean  t 
drawn  from  the  series  of  measuremen 
spell  particular  words. 

As  a  spelling  test  by  itself  is  a  re 
children's  performances  only  for  the 
selection  of  test  words  becomes  an  ii 
It  would  be  as  unfair  to  test  a  school  s1 
and  difficult  words  as  it  would  be  to  1 
only  those  which  had  recently  been  ta 
room.    Fortunately,    in    all    written 


TABLE XXV
Misspellings of These Pupils¹

                      AVERAGE     AVERAGE     INDIVIDUAL RECORDS
                      ACCURACY    CLASS
      WORD            (AYRES)     ACCURACY      A       B       C

  1   Senate            73%         68%         0%      0%    100%
  2   Majority          73          69          0       0     100
  3   Necessary         73          69          0     100       0
  4   Celebration       79          66        100       0       0
  5   Mere              73          68          0     100     100
  6   Respectfully      73          78          0     100     100
  7   Testimony         66          66          0     100     100
  8   Elaborate         73          67        100       0     100
  9   Discussion        66          73        100       0     100
 10   Arrangement       66          67        100       0     100
 11   Citizen           73          78        100     100       0
 12   Agreement         73          70        100     100       0
 13   [word missing]    73          68        100     100       0
 14   Suggest           73          92        100     100       0

¹These pupils had the same accuracy score as the class (70-71 per cent.).

This table is to be read as follows: The word "senate," whose average
accuracy of spelling, according to Ayres, was 73 per cent., was
spelled by the class with an accuracy of 68 per cent. It was misspelled
by both individuals A and B, but spelled correctly by C. The reader
should note that individuals A, B, and C have the same average accuracy
on the 20 words (70 per cent.), and this accuracy is the same as the
average made by the class as a whole on the 20 words (71 per cent.).
The three individuals, however, did not miss the same word, and in
only four cases (the first four words in the table) did two of them miss
the same word. To record the different words misspelled by these
three individuals of average ability it is necessary to list 14 words,
although each of the children misspelled but 6 words.
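
The reading of Table XXV can be checked mechanically. The following is a minimal sketch, not part of the original study: it transcribes the individual records from the table (1 for a word spelled correctly, 0 for a word missed), uses the placeholder label "word 13" for the entry whose word is missing from the table, and assumes a test length of 20 words. The names `records`, `missed`, and `TEST_LENGTH` are introduced here for illustration only.

```python
# Individual records from Table XXV, in the order (A, B, C).
# 1 = spelled correctly, 0 = missed. "word 13" is a placeholder label
# for the entry whose word is missing from the table.
records = {
    "senate": (0, 0, 1),
    "majority": (0, 0, 1),
    "necessary": (0, 1, 0),
    "celebration": (1, 0, 0),
    "mere": (0, 1, 1),
    "respectfully": (0, 1, 1),
    "testimony": (0, 1, 1),
    "elaborate": (1, 0, 1),
    "discussion": (1, 0, 1),
    "arrangement": (1, 0, 1),
    "citizen": (1, 1, 0),
    "agreement": (1, 1, 0),
    "word 13": (1, 1, 0),
    "suggest": (1, 1, 0),
}
PUPILS = ("A", "B", "C")
TEST_LENGTH = 20  # assumed number of words in the full list test

# Words missed by each pupil.
missed = {
    pupil: {word for word, marks in records.items() if marks[i] == 0}
    for i, pupil in enumerate(PUPILS)
}

# Accuracy of each pupil on the whole test, in per cent.
accuracy = {
    pupil: 100 * (TEST_LENGTH - len(words)) / TEST_LENGTH
    for pupil, words in missed.items()
}

missed_by_all = set.intersection(*missed.values())
missed_by_any = set.union(*missed.values())
missed_by_two = [word for word, marks in records.items() if marks.count(0) == 2]

print(accuracy)
print(len(missed_by_any), sorted(missed_by_all), missed_by_two)
```

Under these assumptions each pupil misses 6 words and scores 70 per cent., no word is missed by all three, only the first four words are missed by two pupils, and 14 different words are needed to list every miss.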


ii^ito  opening  oLU.it;. 

LIST  TESTS 

The  conventional  form  of  school  ex* 
ability  is  spelling  words  in  lists.  It 
in  daily  life.  When  a  list  of  words 
the  child  has  time  to  recall  conscious! 
of  spelling  a  word.  He  has  oppoii 
by  rule,  by  guess  work,  or  by  reas< 
both  the  letters  he  uses  and  their  or 
tually  written  on  the  paper  tells  mer 
process  employed  by  the  child — a 
reason,  or  guess — has  in  the  partial! 
the  correct  result. 

Spelling  in  daily  life,  however,  has 
If  one  is  conscious  that  he  is  uncertai 
a  word,  he  consults  a  dictionary.  r 
made  are  thus  in  automatic,  uncons< 
the  attention  is  concentrated  upon 
expressed  so  that  the  errors  are  un 
seem,  therefore,  that  a  test  to  be  a  i 
ability  must  be  given  under  such  c 
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spelling  habits  when  writing  freely  are  correct.    This  is 
the  reason  for  the  use  of  more  than  one  test. 

DICTATION  TESTS 

The timed dictation of sentences is an unusual form of
spelling test, and a few points in regard to its construction
need explanation.

The  tests  consisted  of  ten  sentences,  each  containing 
two  test  words.  The  sentences  in  any  one  test  were 
made  of  equal  length  within  approximately  five  letters 
and  care  was  taken  to  employ  no  words  of  greater  spelling 
difficulty  than  the  test  words.  The  rate  of  dictation  was 
controlled,  each  grade  being  given  the  material  and  rate 
corresponding  to  the  median  rate  of  free  writing  at  its 
grade.  That  is,  the  test  was  given  for  the  purpose  of 
determining  how  many  of  the  children  could  spell  the 
given  words  when  writing  rapidly. 

A  defect  in  this  series  of  tests  was  that  while  most  of  the 
sentences  were  natural,  a  few  were  markedly  artificial, 
due  to  the  necessity  of  using  certain  words.  Again, 
some of the test words occurred at the end of the sentence.
The child who is naturally slow, and who in such
a  test  is  compelled  to  write  at  a  rate  higher  than  his 
natural  rate,  tends  to  omit  the  last  words  of  the  sentence. 
Such  omissions  were  counted  as  misspellings.  The  test 
words  should  all  have  occurred  at  the  beginning  or  middle 
of  sentences. 

Tabulations were made of the number of words omitted
by children in the classes making the lowest scores in
each grade (Table XXVI, page 118). The maximum
effect due to the omission of words is less than one word,
or 5 per cent., except in classes where the accuracy of
spelling falls below 50 per cent. In such cases, as Ayres
has pointed out, the test ceases to be a spelling test and
becomes a guessing contest. Only when more than half
the children in a class are unfamiliar with a word are
the omissions large. From the scores of children in
other cities, it was not expected that the scores of the
Gary children in any test would fall below the 50 per
cent. level. Actual conditions were not foreseen in
planning the test.

Table XXVI (continued)

The table is to be read as follows: In the second grade class having
the lowest accuracy there were 20 children. Of these, 19 misspelled one
or more words, 18 omitted one or more words, 3 wrote one or more
words illegibly, 7 substituted words for the words pronounced by the
teacher. On 20 words the average mistakes in spelling were 6.9 words,
in omissions 8.5 words, in substitutions .6 of a word, in illegibility .9
of a word. Total errors were 16.9, making the average accuracy 15.5
per cent. The omissions and substitutions are large only in those classes
where the average accuracy is lower than 50 per cent. Note the record of
even the second grade class when the class average was 57.5 per cent.
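
The arithmetic in the reading of Table XXVI can be reproduced in a few lines. This is only an illustrative sketch using the four averages quoted for the lowest second grade class; the variable names are assumptions made for the example.

```python
TEST_WORDS = 20  # words in the dictation test

# Average errors per child, as quoted in the reading of Table XXVI.
average_errors = {
    "misspelled": 6.9,
    "omitted": 8.5,
    "substituted": 0.6,
    "illegible": 0.9,
}

# Total errors per child, rounded to one decimal as in the table.
total_errors = round(sum(average_errors.values()), 1)

# Average accuracy is the share of the 20 words left after all errors.
average_accuracy = round(100 * (TEST_WORDS - total_errors) / TEST_WORDS, 1)

# One omitted word out of 20 changes the score by this many per cent.
one_word_share = 100 / TEST_WORDS

print(total_errors, average_accuracy, one_word_share)
```

The totals agree with the figures quoted from the table (16.9 errors, 15.5 per cent.), and one word in twenty corresponds to the 5 per cent. mentioned in the text.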

It  must  be  remembered,  also,  that  a  large  number  of 
the  omissions  are  really  due  to  misspellings.  When  a 
child  has  to  "stop  and  think"  how  a  word  is  spelled, 
he  is  not  able  to  spell,  as  spelling  is  defined  in  this  test. 
The omission of words to catch up is, therefore, equivalent
to misspelling. There may be some, however, who
are unwilling to accept this point of view. They should
base their conclusions on the list tests which are untimed.
Nevertheless,  it  is  the  opinion  of  the  writer  that  the  real 
value  of  a  spelling  test  does  not  lie  in  the  certainty  with 
which  the  children  show  their  maximum  ability,  but  in 
the  certainty  with  which  it  points  out  those  children 
who  fail  to  spell  correctly  in  their  ordinary  written  work. 

COMPOSITION  TEST 

In  checking  the  results  of  the  different  examiners  for 
the  same  papers  in  the  composition  test,  it  was  noted 
that  there  were  frequent  disagreements  as  to  which 
words  were  misspelled.  Some  of  the  differences  were 
due  to  mere  errors  in  reading  on  the  part  of  the  scorers, 
but  many  were  due  to  the  inability  of  the  examiners 
to  agree  as  to  what  constituted  misspelling.  Finally, 
after  repeated  conferences  of  the  scorers,  definitions  of 
misspellings  and  a  number  of  arbitrary  rules  were 
adopted.  The  papers  were  then  scored  independently 
by  two  observers.  Each  of  the  two  made  a  list  of 
misspelled  words  without  in  any  way  marking  the  papers 
himself.  The  two  lists  were  then  compared  by  a  third 
person, and all differences in scoring by the first two examiners
were checked by this third person by reference
to the original papers. In this way the scoring was made
reliable, although perfection in scoring proved to be
very difficult to attain. In considering results from
similar tests in other school systems as a basis for comparison,
it would be necessary to inquire whether equal
care had been taken before such comparison could be considered
valid.

¹Those interested in this point should note the correlations given in
Table XXX, page 133.

The  definition  of  misspelling  finally  adopted  was  the 
following: 

Any  variations  of  the  character,  number,  or  order  of 
the  letters  called  for  in  a  given  situation,  except  those 
which  are  plainly  caused  by  grammatical  errors,  are  to  be 
considered  misspellings  and  marked  "E." 

The  part  of  this  definition  to  be  particularly  noted  is 
that  the  scorers  were  unable  to  decide  whether  or  not  a 
given  word  was  misspelled  by  examination  of  the  letters 
alone.  The  whole  situation  had  to  be  considered.  This 
will  become  plain  as  each  qualifying  rule  is  discussed. 
The  following  rules  were  adopted  and  used  in  marking  the 
papers: 

(1) Make no record of any questionable grammatical
errors.

If  a  child  wrote,  "I  only  done  my  duty," 
the  substitution  of  "done"  for  "did"  was 
considered  a  grammatical  error,  and  not 
an  error  in  spelling. 

In  cases  of  doubt,  the  rule  gives  the 
child  the  benefit  of  the  doubt. 
(2)     Regard  all  words  in  which  illegible  letters  occur  as 
misspelled,  marking  them  "L". 

Illegibility  caused  by  poor  writing  thus 
operates  to  increase  misspelling. 
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(3)  Errors  in  simple  words  caused  by  the  addition  or 
omission  of  common  letters,  prefixes  or  suffixes, 
are  to  be  counted  as  slips  and  marked  "S." 

Illustration:  "I  walk  home  an  ate 
my  dinner,"  the  omission  of  "ed"  on 
walked  and  of  "d"  on  and  will  be  marked 
"S"  for  slips. 

A slip is caused by some form of inattention
and is ordinarily a defect of the writing
activity. It may not indicate lack of
knowledge of how to spell, but is, nevertheless,
an error in spelling. The examiners
were divided in their opinions.
Some  contended  that  spelling  ability  is 
to  be  measured  not  by  slips,  but  by  errors 
in  words  of  importance.  The  matter  was 
adjusted  as  follows:  Every  error  of  any 
sort  was  noted.  One  tabulation  was  made 
of  the  total  errors,  and  a  second  one  of  the 
slips  alone.  It  is  thus  possible  for  the 
reader  to  decide  the  point  for  himself. 
But  whether  slips  are  to  be  counted  as 
misspellings  or  not,  it  is  certain  that  they 
reveal  one  characteristic  of  the  training. 
A  well  trained,  careful  child  does  not  make 
such  slips  any  more  than  he  misspells 
words.  A  change  in  the  number  of  slips 
from  grade  to  grade  is  thus  as  much  of  an 
indication of the efficiency of training as
a change in the number of words misspelled.

Since  a  certain  element  of  subjective 
judgment  had  to  be  reckoned  with  in  the 
determination  of  which  mistakes  were  to 
be  considered  slips,  all  doubtful  cases  were 
passed  upon  by  three  judges.  No  hard 
and  fast  line  could  be  drawn  between 
true misspellings and slips, but the restrictions
given in the rule were adhered to
throughout. 

(4) The substitution of one homonym for another is
to be counted as a misspelling.

Illustration:  Their  for  there,  fairy  for 
ferry,  were  counted  misspellings,  even 
though  the  words  actually  appearing  on 
the  papers  were  correctly  spelled. 

(5) The substitution of one word for another, as
lighting for lightening, is to be decided in each
case on its merits.

If  the  context  shows  that  the  child 
made  a  mistake  in  the  selection  of  the 
word  to  express  his  thoughts,  no  record 
is  to  be  made  of  the  mistake;  but  if  the 
word  in  question  is  incorrect  because  of 
faulty  spelling,  it  is  to  be  counted  as  a 
misspelling,  and  marked  "M". 

Decision in these cases again involved
subjective judgment, but as above, questionable
cases were passed upon by three judges.
The number of such substitutions was
not large.

(6) Slight changes in common words will be marked
as misspellings and not as slips.

For example: than for them; were for
where, etc.

(7) The omission of hyphens in compound words, the
separation of words into parts, as "some where,"
or the faulty division of words at the end of a
line, are not to be counted as misspellings if both
parts of the word are correctly spelled.

(8) Proper names of unknown persons will be accepted
as spelled correctly, but all misspellings of well
known proper names will be recorded.

"Wasington"  as  the  name  of  a  boy 
playmate  was  accepted  as  correct,  but 
as  the  name  of  the  first  President  of 
the United States was counted misspelled.

(9) Faulty use of capitals, omissions of dots over "i's"
or crosses over "t's" are not to be counted either
as misspellings or slips.

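
The two tabulations described under rule (3), one of total errors and one of slips alone, can be sketched as follows. This is a hypothetical illustration rather than the examiners' actual procedure: the marks "E," "L," "S," and "M" are those named in the rules, while the function `tabulate`, the descriptions in `MARKS`, and the sample paper are assumptions made for the example.

```python
from collections import Counter

# Marks named in the scoring rules; the descriptions are paraphrases.
MARKS = {
    "E": "misspelling",
    "L": "word with illegible letters, counted as misspelled",
    "S": "slip",
    "M": "substituted word judged to be a faulty spelling",
}


def tabulate(marks):
    """Return (total errors, slips alone) for one paper's list of marks."""
    counts = Counter(marks)
    unknown = set(counts) - set(MARKS)
    if unknown:
        raise ValueError(f"unrecognized marks: {sorted(unknown)}")
    # Every error of any sort counts toward the total; slips are also
    # reported separately so the reader can judge either way.
    return sum(counts.values()), counts["S"]


# Hypothetical marks recorded for a single composition.
paper = ["E", "S", "S", "L", "M"]
total_errors, slips = tabulate(paper)
print(total_errors, slips)
```

Keeping both counts leaves open the question on which the examiners were divided, namely whether slips should be reckoned as misspellings.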
After  the  spelling  errors  in  the  compositions  had  been 
determined  by  the  above  rules,  the  tabulations  brought 
up an added difficulty. Children wrote papers of varying
lengths, so that the possibility of error differed
greatly. To reduce the results to a comparable basis, a
coefficient of misspellings was computed for each child.
This coefficient was taken as the number of words misspelled
per thousand words written; that is, the actual
number of errors in spelling was divided by the
total number of words written, and the division carried
to the nearest thousandth. A tabulation of a
random selection of cases proved that there was no
relation between the length of a paper and the size of
the coefficient of misspellings; consequently the values
thus determined were accepted as a measure of spelling
ability.
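
The coefficient of misspellings described above amounts to a single division. The sketch below is illustrative only; the function names and the sample papers are hypothetical and are not drawn from the Gary records.

```python
def misspelling_coefficient(errors, words_written):
    """Errors divided by words written, carried to the nearest thousandth."""
    if words_written <= 0:
        raise ValueError("a paper must contain at least one word")
    return round(errors / words_written, 3)


def per_thousand(errors, words_written):
    """The same measure expressed as misspellings per thousand words."""
    if words_written <= 0:
        raise ValueError("a paper must contain at least one word")
    return round(1000 * errors / words_written)


# Two hypothetical papers of different lengths with the same rate of error.
short_paper = misspelling_coefficient(3, 75)
long_paper = misspelling_coefficient(6, 150)
print(short_paper, long_paper, per_thousand(6, 150))
```

Because the number of errors is divided by the length of the paper, a short paper and a long one written with the same rate of error receive the same coefficient.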

It  should  be  remembered,  however,  that  such  a  measure 
is  a  gross  measure  only.  To  misspell  a  word  written 
correctly  by  most  of  the  other  children  in  a  grade  is  a 
much  more  serious  error  than  misspelling  an  unusual 
word.  A  truly  significant  coefficient  of  spelling  ability 
should  be  based  upon  the  relative  seriousness  of  the 
various misspellings. However, as no information bearing
on this point was available, the gross coefficient was
accepted at its face value. Hence the actual misspellings
were also listed and analyzed.

An analysis of the words misspelled by the Gary eighth
grade children in their compositions on the basis of
Jones' vocabulary list is given in § 1 of this chapter. A
similar analysis was made on the basis of Ayres' Spelling
Scale. There are fewer words common to the Ayres
Scale and the compositions than there are to Jones'
lists and the compositions, but the tabulation fully confirms
the conclusions reached previously (Table XXVII,
page 127). The results show plainly that as the difficulty
of the words increases the Gary scores fall more and more
below Ayres' standard.

It  may  be  objected  that  Ayres'  standards  are  too 
high.  They  represent,  however,  the  average  scores 
made  by  the  children  of  eighty  four  representative 
American  cities,  and  are  not  standards  arbitrarily  set. 
To be sure, the tests by which the standards were determined
were given under conditions no more uniform
than can be secured by the transmission of instructions
in correspondence. The standards have, however,
been repeatedly checked by subsequent tests under
carefully controlled conditions and have proved valid
(Table XXVIII, page 128). The values set by Ayres approximate
the values which represent the average spelling
ability of children in various cities of the United States.
The differences from Ayres' standards are small, and
vary from grade to grade. In some cases they are negative,
and in others positive. Many cities, indeed, show
an average difference which is positive, and in amount
equal to from 4 per cent. to 12 per cent. On the
whole, therefore, comparisons with Ayres' standards are
valid.

RANGE   OF  ABILITY 

In discussing the results of educational tests the variation
of individual ability within the grade is almost as
much a matter of concern as the class score. A well
graded, efficient system, meeting the needs of individuals
at every turn, may be expected to have compact grades
of low variability. The range of variation within the
grades at Gary is very great and the coefficients of variability
large (Table XXIX, page 130).

RELATION  BETWEEN  TESTS 

A  series  of  measurements  of  a  single  type  of  ability 
affords an opportunity for a study of the relation between
the tests used. A number of interesting questions
suggest  themselves.  Are  the  children  who  make  low 
scores  in  the  formal  spelling  tests  those  who  misspell  the 
largest  number  of  words  in  their  compositions?  Will  a 
dictation  test  or  a  list  test  reveal  most  accurately  the 
children  who  make  errors  in  spelling  in  their  compositions? 

The scores of the children in the list and dictation tests
show a greater correspondence than do the scores from
any other two of the tests (Table XXX, page 133).
Eighty eight per cent. of the 42 eighth grade children
present for all tests maintain their place in the two distributions
within one unit of variability. That is, 9 children
out of 10 will be as much above or below the median
of the group, relatively, in one test as in the other. The
reader will, of course, note that the median deviation for
these two tests was 24 per cent. and 20 per cent., respectively.
If, however, the limit of variation be reduced to
half that figure, 60 per cent. of the children will be found
to maintain the same relative place in the distribution
whether measured by one test or the other.
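
One way to compute a coefficient of correspondence of this kind is sketched below. The report does not give the exact procedure, so the sketch rests on an assumption: each child's score is expressed as a distance from the class median in units of the median deviation, and a child is counted as maintaining his place when his two relative positions differ by no more than the chosen limit. The function names and the scores are hypothetical.

```python
from statistics import median


def relative_positions(scores):
    """Distance of each score from the class median, in median deviations."""
    middle = median(scores)
    unit = median([abs(score - middle) for score in scores])
    if unit == 0:
        raise ValueError("median deviation is zero; positions are undefined")
    return [(score - middle) / unit for score in scores]


def correspondence(test_a, test_b, limit=1.0):
    """Per cent. of children whose relative positions differ by <= limit."""
    if len(test_a) != len(test_b) or not test_a:
        raise ValueError("both tests need scores for the same children")
    pairs = zip(relative_positions(test_a), relative_positions(test_b))
    within = sum(1 for a, b in pairs if abs(a - b) <= limit)
    return 100 * within / len(test_a)


# Hypothetical scores (per cent. correct) for five children on two tests.
list_scores = [20, 40, 50, 60, 80]
dictation_scores = [0, 50, 40, 60, 70]

print(correspondence(list_scores, dictation_scores))       # one unit
print(correspondence(list_scores, dictation_scores, 0.5))  # half a unit
```

As in the text, narrowing the limit from one unit of variability to half a unit lowers the percentage of children counted as holding the same relative place.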

The correspondence between actual mistakes in spelling
and the mistakes in the formal tests is such that three
fourths of the children maintain their relative places in
the two distributions. The correspondence is slightly
greater between the misspellings in the compositions
and the dictation test¹ than it is for the list test, but the
differences are scarcely significant. If slips alone be
considered, the correspondence with the scores in the
dictation test is considerably closer than with the scores
in the list tests, but if the total errors are to be considered,
the relative merits of the two tests are exactly reversed.
That is, the children who make slips in their compositions
are the children who misspell in the dictation test, rather
than those who miss in the list test; but if the combined
scores of slips and misspellings be counted, the correspondence
between total errors in the composition test and
scores in the list test is slightly greater than the correspondence
between the total errors in the composition
test and scores in the dictation test. If the limit of
variation be restricted to half a unit of variability, these
relations are altered very slightly. On the whole, therefore,
judgments as to the way children will spell in their
compositions, based upon either a list or dictation test,
are fairly reliable.

Those unfamiliar with statistical methods will find
the graphic record shown in Figure 22 on page 135 a more
satisfactory basis for judging of the relation between the
three sets of scores. While in general the correspondence
between scores in the composition test and the score in
the formal spelling test is relatively close, there are
among these 42 individuals some 5 or 6 who do better
in their compositions than their scores in the list tests
would warrant, and about the same number who do
better in their formal tests than their spelling in the
compositions would warrant. That is, spelling ability
is specific, not general, and performance in any one test
is dependent upon so many factors that performance
in a single test is not a reliable basis from which to judge
of an individual's performance in a related test. However,
if the two tests are closely alike, as for instance,
list and dictation tests, the inference will be more reliable
than from performance in one test to performance in a
test of totally different character, as from performance
in a list test to performance in a composition test.

Coefficients of Correspondence in Different Trials of
Spelling Tests*

Column headings: Class; Individual Deviation from Average; Total Range
of Score.

S = Slips.
M = Misspellings.
T = Total Mistakes.
L = List Test.
D = Dictation Test.

Coefficients of Correspondence: Percentage of Total Cases Which Do Not
Vary in Relative Position, in the Two Distributions Compared, More Than
One (or one half) Unit of Variability.

This table is to be read as follows: If the relation of the scores of
individual children in number of slips made in their compositions to
the median number made by the class as a whole be compared with the
relation of the same individual scores in misspellings to the median
number of misspellings made by the class as a whole, 48 per cent. of
the children will be found to have maintained the same relative position
in the two sets of scores within one unit of variability; that is, within
5 slips, or 14 n

¹The Pearson Coefficient of Correlation for the relation between misspellings
and scores in the dictation test for these same 42 children is .67
(probable error .06).
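
The footnote on the Pearson coefficient reports a correlation of .67 with a probable error of .06 for the 42 children. The sketch below shows how such figures are commonly obtained; it assumes the classical probable error formula, 0.6745(1 - r²)/√n, which the report does not state but which reproduces the reported value. The function names and the sample scores are illustrative only.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Product-moment correlation between two equally long score lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with 2+ scores")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when scores do not vary")
    return sxy / sqrt(sxx * syy)


def probable_error(r, n):
    """Classical probable error of r (assumed formula, see lead-in)."""
    return 0.6745 * (1 - r ** 2) / sqrt(n)


# Figures quoted in the footnote: r = .67 for 42 children.
print(round(probable_error(0.67, 42), 2))

# Hypothetical scores that rise together in exact proportion.
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
```

A coefficient more than ten times its probable error, as here, was conventionally read as a dependable relation between the two sets of scores.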

The list test and the dictation test upon which these
computations were based (Table XXX, page 133) were of
very different degrees of difficulty, one being composed of
words which were easy for the grade, and the other of
words which were difficult for the grade. Search through
the records of various classes brought to light one class
in which the records in the dictation test and the list test
were closely the same. This was Class No. 11, Emerson,
rated as a fourth grade class in June, 1916. The average
score in the list test was 55 per cent. and in the dictation
test 61 per cent. The coefficients of correspondence based
on the scores of this class were also computed (Table
XXXI, page 138). In this case the relations shown in the
previous table are reversed. It is the list test that corresponds
more closely with the scores for slips in the spellings,
while the dictation test corresponds more closely
with the total number of errors made. As before, the
correspondence between the list and dictation scores is
greater than that for any other relation. If the limit
of variation is reduced to half the median deviation, the
magnitude of the correspondence of course decreases,
but the general relations are not greatly changed.

Figure 22. Comparative Abilities in List, Dictation and Composition
Tests

The scale along the base of the figure represents individuals, 42
eighth grade pupils in all. The scale along the vertical axis represents
the position of the individuals in the distribution. The line marked zero
represents the class median. The lines marked 1, 2, 3, 4, and 5 above or
below the median line represent differences from the median equal to one
unit of variability (in this case, the median deviation of individual scores
from the class median).

For each individual the score made in each of the three tests is indicated
by the lines. Individuals are arranged in order of their performance in
the composition test; that is, individuals 1 to 8 had no mistakes in spelling
in their compositions, individual 42 had 87 mistakes in spelling per
hundred words written in his composition.

The broken line is based upon accuracy of spelling in the list test.
The scores range from that of individual 20, who missed none of the 20
words, to that of individual 41, who missed 19 of the 20 words.

The dotted line represents the scores made in the dictation test.
The scores range from those of individual 16, who missed none of the
words, to individual 39, who missed 19 of the 20 words.

It is possible to tell from the figure for any one individual his relative
position in the class for each of the three tests: thus, individual 11 stood
very high in the list test, not quite so high in the dictation test, and still
lower in the composition test, but in all three he was among the upper
25 per cent. of the class. Individual 7, however, stood at the top of the
class in the composition test, a little above the median in the list test,
and in the lowest 25 per cent. in the dictation test.

The curves show that about 75 per cent. of the individuals maintain
the same relative position in the class distributions for the list and dictation
tests that they do in the composition test (within one unit of variability).
The figure shows also that the correspondence between the
list and dictation test is closer than between these two formal spelling
tests and the spelling in the composition test. Eighty eight per cent. of
the individuals maintain the same relative position in the list and dictation
tests within one unit of variability.

The  fact  that  a  child  misses  certain  words  in  the 
formal  spelling  test  is  not  an  indication  that  he  will 
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misspell  them  in  his  compositions  if  he  uses  them.  On 
the  other  hand,  a  child  who  misses  words  in  either  the 
list  or  dictation  test  is  quite  often  the  child  who  makes 
many  slips  in  writing  a  composition  and  has  many  words 
improperly  spelled.  An  attempt  was  made  to  check 
these conclusions by direct evidence from the three spelling
tests. It was hoped that many of the words used
in the formal tests would also be used spontaneously
by the children in the composition test. Careful search,
however, brought to light but five words which were so
used (Table XXXII, page 139); therefore the results are
too few to warrant any conclusions except the two previously
stated, namely: (1) that measurement of spelling
ability is a much more difficult thing than it is popularly
supposed to be, and (2) that children, as a rule, use in compositions
only those words which they know how to spell.

RELIABILITY    OF   CLASS   RESULTS 

The  preceding  discussions  have  had  for  their  purpose 
the  full  statement  in  regard  to  the  unreliability  of  the 
performance  of  an  individual  in  a  single  test  as  a  measure 
of  his  general  ability  in  spelling.  Teachers  and  principals 
unfamiliar  with  testing  work  are  often  amazed  to  see  a 
child  whom  they  have  been  accustomed  to  rate  as  their 
best  speller  make  a  low  score  in  the  formal  test,  while  a 
child who has repeatedly failed in all school work in spelling
makes a high score in the same test. Their amazement,
due to the apparent contradiction between the test
results and their judgments, has in many instances led
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TABLE  XXXI 
Relation Between Results of Different Spelling Tests1
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This  table  is  to  be  read  as  follows:  If  the  relation  of  the  scores  of 
the  individual  children  in  number  of  slips  made  in  their  compositions, 
to  the  median  number  made  by  the  class  as  a  whole,  be  compared  with 
the  relation  of  the  same  individual  scores  in  misspellings  to  the  median 
number of misspellings made by the class as a whole, 53 per cent. of
the  children  will  be  found  to  have  maintained  the  same  relative  position 
in  the  two  sets  of  scores  within  one  unit  of  variability;  that  is,  within 
nine  slips,  or  sixteen  misspellings. 
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to  the  conclusion  that  because  the  tests  are  unreliable 
measures in the cases of a few individuals they are unreliable
measures of the class as a whole. Nothing
could be further from the truth. The factors which produce
variation in the spelling performances of individual
children probably on the whole cancel each other within
one  group,  so  that  if  the  child  of  greatest  spelling  ability 
who  ought  to  make  the  highest  score  in  the  test  happens, 
on  that  particular  day  because  of  a  severe  headache  or 
other  disturbing  factor,  to  make  a  low  score,  his  place  is 
taken  by  some  other  individual  whose  ability  is  relatively 
lower  but  who,  under  the  stimulus  of  the  special  occasion 
and  the  peculiar  conditions,  makes  a  score  much  above 
that  which  represents  his  median  performance.  As 
a  result,  the  scores  of  classes  tested  with  due  regard  for 
the  control  of  testing  conditions  are  very  reliable,  and 
the results of surveys, such as the present, may be depended
upon to reflect the true conditions.
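
The cancelling of individual variations within a group may be illustrated by a small simulation. The sketch below is illustrative only and rests on assumptions that are not part of the survey: each pupil is given a fixed true ability, and his score on any one day is that ability plus a random disturbance. All names and figures are hypothetical.

```python
import random
from statistics import median, pstdev


def simulate(n_pupils=30, n_days=200, noise=3.0, seed=0):
    """Compare the day-to-day variation of one pupil with that of the class median.

    Returns (spread of one pupil's scores, spread of the class median),
    both as standard deviations over the simulated days.
    """
    rng = random.Random(seed)
    # Fixed hypothetical abilities, one per pupil.
    true_ability = [rng.uniform(5, 15) for _ in range(n_pupils)]
    one_pupil = []
    class_medians = []
    for _ in range(n_days):
        # Each day every pupil's score is disturbed independently.
        day = [a + rng.gauss(0, noise) for a in true_ability]
        one_pupil.append(day[0])
        class_medians.append(median(day))
    return pstdev(one_pupil), pstdev(class_medians)


pupil_sd, class_sd = simulate()
```

Under these assumptions the class median varies from day to day much less than the score of any single pupil, which is the sense in which class results are reliable although individual results are not.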

The extent to which these statements are true is shown
by the relative standing of the 31 classes in the Froebel
school in the three spelling tests (Table XXXIII, page 140,
Figure 23, page 143). As the three tests have very different
scores, so that direct comparison of score with score is
not  possible,  the  expedient  has  been  adopted  of  referring 
each  class  score  to  the  corresponding  city  wide  general 
value.  The  generalized  city  scores  express  the  general 
development  of  ability  throughout  the  Gary  system. 
The  difference  between  each  class  score  and  this  general 
value  shows  the  class  standing.    That  is,  a  class  that  is 
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Figure  23 

Class Scores in the List, Dictation, and Composition Spelling Tests

[Figure: Reliability of Class Scores. Curves for the list, dictation, and composition tests, plotted by class number.]

The scale along the base of the figure represents various classes from the
second to the twelfth grade in the Froebel school. The scale along the
vertical axis represents differences above and below the city wide generalized
score.

The solid line is based upon results from the list test; the broken line
from the dictation test; the dotted line from the composition test.
Class No. 13 in all three tests made exactly the score chosen to represent
the city wide score for the grade. Class No. 20, however, made
a score approximately 10 per cent. below the median in both list and
dictation tests, but was very low in the composition test.

As a whole, the curves show that, with a few exceptions, the relative
spelling ability of a class is correctly indicated by any one of the three
tests.


^laiiuy,  nowever,  the  general  stan 
in  each  of  the  three  tests. 

The  correspondence  between  tl 
tests  is  close  in  spite  of  the  fact  t 
scores  given  in  the  table  are  gross 
by  all  those  factors  of  variation 
discussed  in  Chapter  VIII.    Classes 
in  the  graph  show  the  only  marked 
general  correspondence  of  the  (Men 
tion  tests  with  the  other  two  sets  is  a 
for  several  classes  the  disagreement 
particularly  true  of  the  seventh  and  e 
No.  20,  which  shows  the  largest  diver 
fully  checked  on  the  basis  of  the  sec 
were  present  for  three  trials.    The  c 
materially  altered  by  the  differences 
the  other  hand,  the  relative  position  c 
in  the  class  was  closely  the  same 
The  coefficient  of  correspondence  fo 
tion  tests  was  90  per  cent.;  for  the  1 
tests,  80  per  cent.;  for  the  dictatic 
tests,  70  per  cent.    In  other  words 
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ticular  class  made  a  larger  number  of  spelling  errors  in 
their  compositions  than  their  standing  in  the  list  and 
dictation  tests  would  warrant,  the  effect  is  due  to  some 
peculiarity  of  their  training,  either  in  composition  or 
ting,  and  not  to  lack  of  correspondence  between  the 
tests  themselves.  From  the  results  as  a  whole, 
it  is  possible  to  say  that,  with  one  or  two  exceptions,  a 
class  which  is  shown  to  be  high  by  any  one  of  the  spelling 
tests  will  be  correspondingly  high  in  each  of  the  other 
two,  and  vice  versa. 

CONCLUSIONS 

As  the  conclusions  of  this  survey  in  regard  to  the 
spelling  ability  of  the  Gary  children  have  been  reached 
only after a careful study of the full data from three spelling
tests, which differ markedly in character and in the
conditions under which they were given, and as the results
of the three tests agree in their general implications,
it  seems  to  the  writer  probable  that  the  conclusions  would 
not  be  changed  were  the  number  of  test  words  to  be 
largely  increased. 
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A  RITHMETIC  still  holds,  i 
prominent place as one
both a liberal and a voc
the  elementary  grades  of  Gary  i 
958  hours  of  classroom  time  as 
hours   in   the   conventional   scho 
allotments  are  the  same,  being  19 
time  devoted  to  the  fundamentals. 

PRODUCTS  MEASUI 

The  products  of  training  in  arith 
of  varying  complexity.  They  rang 
skills  as  addition  and  multiplicatio 
products  as  ability  to  reason  in 
Measurement  of  the  simple  skills  is 
but  just  what  constitutes  a  legi 
reasoning  problem  at  each  stage  of 
not  yet  been  determined.  Accordin 
reasoning  tests  were  given  at  Gary. 

The  mechanical  skills  in  arithmeti 
urement  at  Gary  were  addition.  suK 
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tion,  and  division  of  whole  numbers  and  fractions. 
These abilities are at least fundamental for all arithmetical
work, both in school and in later life.

TESTS  USED 

The  actual  tests  used  at  Gary  were  selections  from  the 
arithmetic  tests  in  the  Cleveland  Survey,  and  the  Courtis 
Standard  Research  Tests,  Series  B.  The  Series  B  tests 
were  given  twice,  about  five  weeks  apart,  April  25  and 
May  31. 

RESULTS  FROM  SERIES  B 

Half  of  the  eighth  grade  children  were  able  to 
work  but  8.3  or  less  of  the  addition  examples  in  the 
Series  B  tests  in  8  minutes  and  at  this  rate  the  median 
accuracy  of  work  was  55  per  cent.  In  subtraction  the 
median eighth grade score rose to 9.0 examples and 72
per cent. accuracy; in multiplication the eighth grade rate
was 8.4 examples, accuracy 67 per cent.; in division the
eighth grade median was 6.7 examples, the accuracy 74
per cent. (Table XXXIV, page 148). The results for
addition are shown graphically in Figure 24, page 150.
In all four operations there are small regular increases in
both rate of work and in accuracy throughout the elementary
grades; a growth that for three operations
continues through the high school years as well. In
multiplication,  however,  the  development  increases  very 
little  in  accuracy  beyond  the  eighth  grade  level.  In 
addition  and  multiplication  the  twelfth  grade  accuracy 
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Table  XXXIV — Continued 

This table is to be read as follows: The fourth grade in Gary in
the time allowed was able to work, in addition, 4.3 examples like the
sample shown with an accuracy of 36 per cent. That is, of the
examples tried, about one third were right; in subtraction the amount
done was 5.0 examples and the accuracy of work was 41 per cent.; in
multiplication the rate was 3.4 examples, the accuracy 37 per cent.; in
division the rate was 2.4 examples, accuracy 24 per cent.

The average differences of generalized from actual city wide median
scores are as follows: Addition: rate, .2; accuracy, 2.4. Subtraction:
rate, .3; accuracy, 2.8. Multiplication: rate, .4; accuracy, 3.5.
Division: rate, .4; accuracy, 5.0.

scores do not rise above the 70 per cent. level; the highest
twelfth grade accuracy (division) is 86 per cent.
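
The two scores reported for each grade, rate and accuracy, may be illustrated by a short computation. The sketch below assumes that rate is the number of examples attempted, that accuracy is the per cent. of attempted examples worked correctly, and that the grade score is the median of the individual scores; the survey does not state the exact order of its computations, and the papers shown are hypothetical.

```python
from statistics import median


def grade_scores(papers):
    """Return (median rate, median accuracy in per cent.) for a grade.

    `papers` is a list of (attempted, right) pairs, one pair per pupil.
    Pupils who attempted nothing are left out of the accuracy median.
    """
    rates = [attempted for attempted, _ in papers]
    accuracies = [
        100 * right / attempted
        for attempted, right in papers
        if attempted > 0
    ]
    return median(rates), median(accuracies)


# Hypothetical papers from five pupils: (examples attempted, examples right).
papers = [(4, 1), (5, 2), (3, 1), (6, 3), (4, 2)]
rate, accuracy = grade_scores(papers)
# rate is 4 examples; accuracy is 40.0 per cent.
```

A grade with these papers would be reported as working 4 examples with an accuracy of 40 per cent., which is read in the same way as the fourth grade scores described in the note to Table XXXIV.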

COMPARATIVE DATA1

It is perhaps unfair to judge of achievements in terms
of results alone. Comparative data are essential to
throw light upon two important questions: (1) What
degrees of skill are needed for business life? and (2)
what are the achievements of children in other schools?2

The same tests as those given at Gary have been given
under the same general conditions to adults in various
walks of life.3 The scores in addition vary from 2.9
examples and 31 per cent. accuracy made by the lowest
paid laborers in a large manufacturing establishment, to
scores of 19.1 examples and 89 per cent. accuracy made

1See page 38.

2See Bulletin No. 4, Courtis Standard Research Tests.

3See Fourteenth Yearbook of the National Society for the Study of
Education.
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by  successful  business  men  (Table  XXXV,  page  152). 
Thus  there  is  evidence  that  the  tests  measure  skills  of 
value in life and that rates of 12 examples or more, and
accuracies of 80 per cent. and higher, are necessary in
many forms of commercial activity. The Gary eighth
grade product measured by such standards is very low
in rate of work, and inadequate in accuracy.1

Much  more  significant  are  comparisons  with  the 
achievements  of  children  in  conventional  schools.  In 
addition  a  score  of  11.6  examples  attempted  (Gary  8.4) 
and  accuracy  76  per  cent.  (Gary  57)  may  be  taken  as  the 
average  achievement  of  American  schools  in  northern 
states  (Tables  XXXVI  and  XXXVI-A,  pages  153  and 
154).  These  "norms"  are  derived  from  tests  given  in 
May  and  June,  1916,  in  cities  of  every  type,  large  and 
small,  and  in  widely  separated  states.  Comparison  with 
scores from large cities (Boston 13.7 examples, accuracy
78 per cent.) would make the Gary results seem correspondingly
lower, and with lower scores from the smaller
cities, or from rural schools (a county in Pennsylvania,
7.7 examples, 52 per cent.) correspondingly higher.
Judging  from  all  comparative  data  available,  however, 
it  is  quite  plain  that  Gary  is  low  both  in  rate  of  work  and 
in  accuracy. 
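
The size of the difference may be made plain by expressing each Gary eighth grade score in addition as a per cent. of the corresponding norm quoted above. The sketch below uses only the four figures given in the text (Gary 8.4 examples and 57 per cent.; norm 11.6 examples and 76 per cent.); the function name is hypothetical, and the ratio is offered as an illustration, not as a measure used in the survey.

```python
def percent_of_norm(score, norm):
    """Express a score as a per cent. of the corresponding norm."""
    return 100 * score / norm


# Eighth grade addition scores quoted in the text.
gary_rate, norm_rate = 8.4, 11.6
gary_accuracy, norm_accuracy = 57, 76

rate_ratio = round(percent_of_norm(gary_rate, norm_rate))              # 72
accuracy_ratio = round(percent_of_norm(gary_accuracy, norm_accuracy))  # 75
```

On this reckoning the Gary eighth grade attempts about 72 per cent. as many examples as the average eighth grade, and reaches about 75 per cent. of its accuracy.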

The  Gary  scores  in  addition,  plotted  in  relation  to  the 
median  development  curve  based  upon  the  scores  adopted 
as    representing    average    conditions    throughout    the 

1See also Hanus & Gaylord, Educational Administration and Supervision,
November, 1917.
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TABLE  XXXVI-B 
Comparative  Data,  Series  B 
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SCORES  OF  GARY  EIGHTH  GRADE  CLASSES1 

                 CLASS         1ST TRIAL            2D TRIAL
SCHOOL           NUMBER     RATE   ACCURACY      RATE   ACCURACY

Addition.
Froebel            46         8.7      60          9.0      66
   "               46         8.0      68          8.7      64
Emerson            14         8.0      60          8.0      64
   "               16         7.0      63          7.2      68
Jefferson          18        10.3      64          9.6      60

Subtraction.
Froebel            46         9.0      76          8.6      60
   "               46         9.7      67          9.6      72
Emerson            14         9.2      72          9.6      64
   "               16         8.6      80          8.6      68
Jefferson          18        11.0      92         10.4      81

Multiplication.
Froebel            46         7.3      60          8.6      71
   "               46         7.3      68          9.6      82
Emerson            14         8.2      71          8.7      71
   "               16         7.0      66          7.2      68
Jefferson          18         9.7      71          9.7      78

Division.
Froebel            46         7.7      70          7.6      86
   "               46         8.2      83          7.4      80
Emerson            14         6.3      57          6.6      88
   "               16         6.2      78          6.2      76
Jefferson          18         9.2      81         10.0      96

1See page 104.

United States, make evident both a difference in the general
character of the development at Gary, and the fact
that  the  scores  made  by  the  Gary  children  are  relatively 
very  low.  The  Gary  curve  is  concave;  the  curve  from 
the  general  results  convex  (Figure  25,  page  157).  This 
difference probably means that in the conventional schools
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most  of  the  children  have  learned  to  add  by  the  end  of  the 
fourth  grade,  and  in  the  remaining  grades  there  are  small 
improvements  in  both  rate  and  accuracy  of  work,  due 
partly  to  increasing  maturity,  partly  to  elimination 
of  the  less  able  through  non-promotion,  dropping  out  of 
school,  etc.,  and  partly  to  the  effects  of  training  upon 
those  who  have  not  learned  in  previous  grades.    In 
Gary,  however,  progress  in  the  lower  grades  is  quite 
uniform  in  both  rate  and  accuracy,  being  mainly  in 
rate  in  the  lower  grades,  and  evenly  balanced  between 
rate  and  accuracy  in  the  high  school  years.    The  level 
of  work  is,  however,  very  low — so  low  that  one  is  led  to 
wonder  how  much  of  the  progress  is  due  to  training,  and 
how  much  merely  due  to  the  effects  of  maturity  and 
elimination.1    For  example,  the  twelfth  grade  scores  in 
both  rate  and  accuracy  do  not  reach  those  of  the  seventh 
grade  in  the  conventional  school,  and  the  eighth  grade 
score at Gary is only slightly above the normal
fourth grade level in rate and far below it in accuracy.

The  median  development  curve  in  Figure  25  is  based 
upon  results  from  cities  of  every  type,  large  and  small, 
and  it  is  hardly  fair  to  compare  the  Gary  results  with 
those  from  larger  cities  which  are  known  to  do  better  in 
the  fundamental  subjects  than  small  villages  and  towns. 
However,  a  comparison  of  the  Gary  scores  with  those 
from  smaller  cities  does  not  alter  the  general  character 
of  the  conclusions  to  be  drawn  (Figure  26,  page  159). 

1See Chapter VIII, page 357.
Figure 25
Development of Rate and Accuracy in

The scale along the top of the figure represents rate, or the number of
examples attempted. The scale along the left hand side of the figure
represents ratio of examples right to examples attempted, or the accuracy
of work expressed in per cent. Each point of the diagram, therefore,
represents two scores, rate and accuracy. The position of the circle
marked "4" on the general curve (broken line) represents a rate of 7.4
examples attempted and 64 per cent. accuracy.

The curve for the Gary results is shown by the heavy line. The circles
show the position of the different grade scores.

The  twelfth  grade  score  in  rate  falls  between  the  sixth  and  seventh 
grade  score  on  the  general  curve,  and  slightly  below  the  fifth  grade 
accuracy.  The  eighth  grade  Gary  results  are  not  quite  equal  to  the 
general  fifth  grade  scores  in  rate,  and  very  much  lower  than  the  fourth 
grade  accuracy. 

The  portion  of  the  general  curve  below  the  fourth  grade  is  not  very 
reliable,  but  the  difference  between  the  general  character  of  the  Gary 

1Based on results at Gary and results from thousands of children in
cities of many different types.
Figure 25, Continued

curve and that of the general curve is marked. The general curve indicates
that the development of skill in addition is, in the conventional
school,  nearly  completed  by  the  end  of  grade  five,  while  the  Gary  curve 
shows  that  there  is  a  very  small,  but  regular,  increase  in  rate  and 
accuracy  from  grade  to  grade  up  to  the  end  of  the  high  school  years. 
In  high  school  years  there  is  no  direct  training  for  the  development  of 
skill  in  addition,  so  the  progress  from  grade  to  grade  must  represent 
either  incidental  training,  or  the  effect  of  the  elimination  of  the  less  able 
by  non-promotion.  Therefore,  the  Gary  curve  as  a  whole  would  seem  to 
indicate  that  growth  in  skill  in  addition  in  all  grades  is  due  mainly  to  the 
same  causes,  and  very  little  to  direct  training. 

All  the  Gary  curves  fall  much  below  those  of  conventional 
schools.  For  all  operations  the  twelfth  grade  scores 
in  accuracy  at  Gary  do  not  attain  to  the  conventional 
levels  of  the  eighth  grade  accuracy,  and  for  all  except 
multiplication  the  rate  of  work  is  also  lower.  The  eighth 
grade  results  in  rate  are  about  equal  to  those  of  the 
conventional  fifth  grade,  and  in  accuracy  are  lower.  In 
division  only  do  the  eighth  grade  scores  in  accuracy 
much  exceed  those  of  the  fourth  grade  in  the  conventional 
school. 

It  is  possible  to  find  schools  with  lower  records  than 
those  at  Gary,  but  they  are  not  common.  For  instance, 
in  a  bulletin  issued  by  the  University  of  Iowa,  the  scores 
made  by  the  various  Iowa  schools  in  the  same  tests 
used  at  Gary  are  grouped  according  to  the  size  of  the 
town  from  which  they  came.  The  towns  vary  in  size 
from  those  of  less  than  one  thousand  population  to  more 
than  ten  thousand  population,  the  class  in  which  Gary 
would fall. Out of 848 comparisons of scores from towns

The curves are drawn in precisely the same fashion as in Figure 25, but
all scales have been omitted in order to bring the four curves together
in one figure. The reader need only remember that displacement to the
right means greater rate, and displacement toward the top of the diagram
means greater accuracy. All circles represent scores in both rate and
accuracy. The grades are indicated by the small figures near the
circles. The solid line represents Gary scores. The broken line represents
results from small cities.

The reader should note that in all tests the eighth grade scores are
low in both rate and accuracy. Except for multiplication the rate records
of the twelfth grade never exceed those of the eighth grade in the conventional
school, and in no case do twelfth grade accuracies attain to
the level of accuracy of the eighth grade in conventional schools.

1Based on Gary results and on results of tests given in smaller cities,
May and June, 1916.



and villages of 1000 population or under, but 82 reported
scores as low as or lower than Gary. This was approximately
10 per cent. of the total scores. For the next
larger size city, the proportion was but 7 per cent., and
for the cities of more than 10,000 population, about one
half of 1 per cent. How much allowance is to be made for
possible  differences  in  the  care  with  which  the  tests  were 
given  and  scored  is  not  known,  but  the  instructions  for 
giving  the  tests  are  well  standardized,  the  procedure  simple 
and  the  results  consistent.  It  would  seem,  therefore,  that 
beyond  question  the  Gary  results  are  exceptionally  low.1 
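
The proportions quoted in this paragraph can be checked directly from the counts given. The sketch below uses the 82 cases out of 848 stated in the text, and the 12 cases out of 160 stated in the footnote on the Indiana cities; the function name is hypothetical.

```python
def percent(part, whole):
    """Express a count as a per cent. of the total number of comparisons."""
    return 100 * part / whole


iowa_small_towns = round(percent(82, 848), 1)  # 9.7, "approximately 10 per cent."
indiana_cities = percent(12, 160)              # 7.5
```

Both figures agree with the statement that scores as low as those of Gary are found in only a small fraction of the schools compared.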

CLEVELAND  ARITHMETIC  TESTS2 

Results  from  the  Cleveland  Arithmetic  Tests  will 
now  be  considered.  The  Cleveland  tests,  it  will  be 
remembered,  are  diagnostic  tests.  That  is,  they  are 
graded  in  difficulty  for  any  one  operation,  so  that  by 
making  comparisons  on  the  basis  of  several  tests  at  once 
it  is  possible  to  judge  the  effects  produced  by  teaching 
effort.  The  portions  of  the  Cleveland  Tests  used  at  Gary 
were  those  dealing  with  multiplication  and  fractions.  The 
results  will  be  discussed  separately. 

1Out of 160 comparisons of Gary eighth grade scores with corresponding
scores of 20 cities in Indiana, in but 12 cases were their scores as low
as or lower than the Gary scores.

2See page 39.

The Gary scores in these tests confirm the conclusions
from the comparative data previously shown. The Gary
eighth grade scores are but 70 per cent. of the corresponding
Cleveland scores in the simplest test, 82 per
cent. in the more complex test, and 87 per cent. in the
most difficult multiplication test (Table XXXVII).
That the Gary scores are relatively lower in the simple
tests than in the complex tests is favorable to the Gary
training, as it means that the Gary children use their
knowledge of the tables more efficiently than the Cleveland
children, but this favorable aspect of the results
should not lead the reader to forget that in all grades
and tests the Gary scores are much below those of
Cleveland.1

The curves for the development of the different abilities
at Gary are concave, while the Cleveland scores are
convex (Figure 27, page 163). This probably means that
the development of ability begins, on the average, two or
three years later in Gary than in Cleveland, so that the
Cleveland curves approach their maxima before the Gary
curves start to rise. The Gary results reach the Cleveland
eighth grade levels only in the late high school grades.

The significant facts of the comparison between Cleveland
and Gary scores are much more readily comprehended
if the comparison is put in the form of Figure 28,
page 164. One can see at a glance that the progress at
Cleveland between grades three and four is more than
that made in grades four, five, or six, in Gary. Similarly,
the progress in grades four and five in Cleveland is
roughly equal to that in grades seven and eight in Gary.

1Comparison with similar data from the Grand Rapids Survey proves
that the Gary scores are also much lower than those from that city.

The solid lines represent Gary; the broken lines, Cleveland. "A"
represents results from the test of multiplication tables. "B" represents
a test of one place multiplication. "C" represents two place multiplication.
For types of examples see text.

The scale along the base of the figure represents grades. The scale
at the left of the figure represents number of examples correctly worked.

Gary curves are concave; Cleveland curves are convex, indicating that
development of these abilities takes place at Gary two or three years
later than at Cleveland.

Figure 28

Rates of Development in Multiplication Tests

[Figure: Comparison, Cleveland and Gary, multiplication. Upper scale: Gary grades 3 to 12.]

In the diagrams the vertical lines represent the median grade scores;
the horizontal spaces represent the multiplication tests.

In the upper diagram the vertical lines are based upon the grade scores
at Gary. In the lower diagram the vertical lines are based upon the
grade scores in Cleveland. In the upper diagram the dotted lines are
drawn in the proper relative position to represent the scores at Cleveland,
while in the lower figure it is the Gary scores that are represented by the
dotted lines.

From either the upper or lower figure it will be seen that grades four,
five, and six at Gary fall between the third and fourth grade curves for
Cleveland.

Figure 29
Rates of Development in Three Multiplication Tests

[Figure: Development curves, Cleveland Tests, multiplication.]

The scale along the horizontal axis represents the percentage the rate
scores of each grade are of the rate scores made by the twelfth grade.
The scale along the left of the figure represents accuracy of work. The
three solid lines represent the curves for the three multiplication tests
which differ in complexity, ranging from a very simple test of the multiplication
tables up to long multiplication (four place numbers multiplied
by two place numbers). The dotted line represents the results from
the multiplication test in Series B. In all, the circles show the position
of the grade scores in both rate and accuracy. The curves differ
markedly in their character; that for long multiplication showing no
signs of reaching a maximum. The development is merely interrupted
at the eighth grade.

As a whole, the set of curves means that the more complex the ability
the less well it is taught.

The Cleveland results are expressed in number of
examples worked correctly in a given time. As explained
elsewhere,1 the author considers that such records
express but a very small part of the meaning of the results;
that for a complete understanding of the significance
of the data they must be expressed in terms of
rate and accuracy of work. Accordingly, the results
of the Cleveland Tests have been thus tabulated, and to
make the data from different tests easily comparable
(that they might be shown in one graph) each rate score
has also been expressed as a percentage of the score
made by the twelfth grade (Table XXXVIII, page 166).
The results in this form show that development in the
[...] is practically completed by the sixth grade
(accuracy 91 per cent.). The development of the ability
to use a one place multiplier, however, increases more
slowly, and does not approximate its maximum development
until the seventh grade (accuracy 84 per cent.).
Development of ability to use a two place multiplier is
of a different character. Increases are fairly regular
and equal from grade to grade, up to the eighth
grade, but from this point on progress in accuracy
[...] and the curve indicates increase in rate only
(Figure 29, page 167).

In other words, in the simplest work the development
is completed early in school life. The more complex is
[...] completed by the eighth grade. In the most
difficult work of all, the development shows no signs of
reaching a maximum, and progress is merely cut off at
[...] level by high school work in which no training
in multiplication is provided. In Figure 29 the dotted
line is based upon the Gary results in the Series B tests,
and attention is called to the exactness with which two
tests (Series B, multiplication test No. 3, and Cleveland
multiplication, set L) confirm each other. The
meaning of such comparisons is plain; the more complex
the ability, the less well it is taught at Gary as measured
by intercomparisons of the results at Gary themselves.
The evidence from the Cleveland Tests thus greatly
strengthens the conclusions drawn from a comparison
of the Gary results with those from other cities.

1See page 208.

The clearness with which such diagnostic tests reveal
the story of what is taking place within a school system
is strikingly illustrated in the case of the fraction
tests. Fractions represent, of course, a more complex
type of development than multiplication. The examples
in the tests called simple fractions were all additions
or subtractions of two fractions having the same
denominator, in which the only response called
for was the addition or subtraction of the numerators.
In the test called complex fractions, however,
the examples involved reduction to a common denominator
and reduction to lowest terms. Further, the
complex test included multiplication and division of
fractions, as well as addition and subtraction. The
examples in the test had unlike denominators. Twenty
one was the largest denominator called for in any example
and all the denominators were products of simple
factors.

The  character  of  the  development  revealed  by  the 
results  (Table  XXXIX,  page  169)  is  a  confirmation  of  the 
conclusions  of  the  previous  discussions.  For  the  test  in 
simple  fractions  the  increases  in  accuracy  from  grade  to 
grade  are  relatively  large  and  continue  up  to  the  tenth 
grade.  For  complex  fractions  the  period  of  rapid  increase 
does  not  begin  until  grade  seven,  and  from  that  point 
is  in  accuracy  only  (Figure  30,  page  171).  In  other 
words, the results for fractions differ from those for multiplication
in exactly the same general fashion as the
multiplication results differ among themselves. The
more  complex  the  ability,  the  less  the  development. 


SCHOOL  TO  SCHOOL  COMPARISONS 

A  comparison  of  the  scores  from  school  to  school 
reveals  rather  larger  differences  in  arithmetical  abilities 
than  in  those  discussed  in  the  previous  chapters.  The 
school which is least well equipped to carry out a modern
program,  the  Beveridge,  shows  quite  uniformly  in  all 
tests  a  larger  number  of  scores  above  the  city  median 
than  the  other  schools.  Jefferson  is  second,  Froebel 
third,  and  Emerson  fourth  (Table  XL,  page  172).  That 
is,  the  Emerson  school  has,  proportionately,  a  larger 
number  of  low  scores  than  any  other  school.  In  all 
schools,  however,  there  are  individual  classes  which  have 
scores  above  the  city  median,  and  others  which  fall 
below  it. 

A  similar  school  to  school  comparison  based  upon  re- 
sults in  the  Cleveland  tests  gave  very  similar  results 
(Table  XLI,  page  173,  Figure  31,  page  174).  Neverthe- 
less, the  differences  from  school  to  school  are  relatively 
insignificant  and  probably  mean  merely  that  the  Beve- 
ridge school  gives  more  emphasis  to  the  drill  work.  In 
Emerson,  on  the  other  hand,  arithmetic  is,  in  general, 
receiving  less  attention  than  in  Jefferson  and  Froebel 
schools.  In  all  the  schools  the  very  best  classes  have 
scores much below those made by children of corresponding
grades in other cities.

Figure 30
Rates of Development in Two Fraction Tests

[Development curves, Cleveland tests, fractions]

The scale along the horizontal axis represents the percentage the rate
scores  of  each  grade  are  of  the  rate  scores  made  by  the  twelfth  grade. 
Scale  along  the  left  of  the  figure  represents  accuracy  of  work.  The  two 
solid  lines  represent  the  curves  for  the  two  fraction  tests;  one,  the  addi- 
tion or  subtraction  of  fractions  having  the  same  denominator,  the  other, 
four  operations  with  fractions  having  unlike  denominators.  For  illus- 
tration of  the  type  of  examples  see  text.  In  both  curves  the  posi- 
tion of  the  figures  indicate  grade  scores  in  both  rate  and  accuracy. 

The  curve  for  the  simple  fraction  test  shows  a  smaller  rate  of  rise  in 
accuracy  than  the  curves  for  long  multiplication  in  the  previous  figure, 
and  for  the  reasons  there  explained.  The  curve  for  the  complex  test  in 
fractions  indicates  mere  growth  in  rate  up  to  the  seventh  grade,  and  from 
that  point  on,  growth  in  accuracy  only. 

As  a  whole,  the  two  curves  show  that  the  Gary  children  have  very 
little ability to work with fractions.

MEASUREMENT OF REASONING POWER

It  may  be  contended  by  some  that  in  place  of  skill  in 
computation  the  Gary  children  are  receiving  a  type  of 
training  which  develops  reasoning  power  instead,  and 
makes  them  better  able  to  cope  with  arithmetical  situa- 
tions after  they  leave  school. 

The  answers  to  this  claim  are:  First,  no  evidence 
of  such  superior  ability  to  grapple  with  arithmetical 


TABLE XLI

School to School Comparison, Series B, Two Trials

The results below show the number of class scores in each school
which are more than one tenth of the corresponding city wide scores
above or below the general results for the city as a whole. All four
operations are combined.


SCHOOL 

NUMBER 
CLASS 

SCORES 
COMPARED 

NUMBER  OF  CLASS  SCORES  ABOVE  AND 
BELOW  CITY  WIDE  RESULTS 

zz. 

££r 

A^CT 

"%zr 

toebel 

eff  crson. .... 

176 
78 

104 
96 

34 
2 
37 
37 

27 
27 
27 
28 

60 
6 
80 
47 

46 

17 
26 
IS 

This table is to be read as follows: Out of 176 class scores in Froebel
compared with the corresponding city wide scores, there were 34 in
rate and 50 in accuracy markedly above the median, and 17 in rate
and 46 in accuracy markedly below the median. That is, the Froebel
school ranks were slightly above the general results for the city.
Similarly, the Jefferson school is slightly higher than the Froebel school,
but lower than the Beveridge school, and much above the Emerson
school in the abstract work.

Figure 31
School to School Comparison

[Cleveland arithmetic test: school to school comparisons, highest and lowest fifth grade classes. Lines: Emerson (E), Beveridge (B), Cleveland (C).]

Comparison  of  scores  made  by  fifth  grade  classes  in  the  Beveridge 
(highest)  and  Emerson  (lowest)  schools  in  Gary  with  the  city  wide 
medians 

The  horizontal  spaces  represent  the  Cleveland  arithmetic  tests  in 
multiplication  and  fractions.  The  vertical  line  represents  the  Gary  city 
wide  grade  medians.  The  line  marked  "E"  represents  the  scores  from 
the  Emerson  school;  line  marked  "B" — the  scores  from  the  Beveridge 
school;  the  line  marked  "C" — Cleveland  fifth  grade  scores. 

The  fifth  grade  class  in  the  Beveridge  school  falls  about  one  year 
above  the  average  for  the  fifth  grade  city  wide  scores,  while  the  fifth 
grade  in  the  Emerson  school  falls  about  an  equal  distance  below.  The 
curve  for  the  average  scores  of  the  fifth  grade  in  Cleveland  was  repre- 
sented by  the  third  curve  except  in  the  complex  test  in  fractions,  in 
which  the  Cleveland  score  is  given  as  zero.  The  Cleveland  fifth  grade 
scores fall between those of the seventh and eighth grades at Gary.

situations was discovered in the course of the survey,
although  the  children  were  repeatedly  required  to 
score  their  papers,  and  to  perform  other  work  inciden- 
tal to  the  testing  which  required  an  intelligent  use  of 
mathematical  skills  as  a  means  to  an  end.  Second, 
however  well  a  person  may  be  able  to  reason,  his  work 
in  the  world  will  be  ineffective  if  he  does  not  have  the 
mechanical  skill  necessary  to  obtain  correct  results. 
In  educational  circles  there  are  some  who  claim  that  a 
child  will  develop  such  skill  as  need  arises,  provided  he 
has  a  motive  for  doing  his  work  correctly.  The  reader 
should  be  careful  to  note,  however:  First,  that  the  scores 
made  by  the  high  school  classes  show  very  small  increases 
in  ability  over  those  of  the  eighth  grade  in  spite  of  the 
fact  that  these  classes  have  more  or  less  incidental  training 
in  arithmetic  through  its  use  in  algebra,  physics,  chemis- 
try, etc. ;  and,  second,  that  there  is  no  evidence  of  any 
beneficial  transfer  from  the  incidental  use  of  arithmetic 
in  the  shop  work  and  activities  of  the  enriched  cur- 
riculum. 

There  should,  however,  be  no  confusion  in  the  mind 
of  the  reader  in  regard  to  this  point.  The  tests  used 
were  not  given  to  measure  reasoning  power.1  They 
prove  merely  that  the  Gary  children  do  not  possess  the 
ability  to  add,  subtract,  multiply,  and  divide  at  a  reason- 
able rate  and  with  reasonable  accuracy — defining  reason- 
able as  that  rate  and  accuracy  which  is  attained  by  the 
average  child  in  the  conventional  school. 

1See Chapter VII, page 328.


176  THE  GARY  SCHOOLS 

§2.    Critical  Discussion 

TYPES OF PRODUCTS

Training in arithmetic falls sharply into two divisions,
(1) arithmetical computations and (2) reasoning.

The  products  of  the  first  type  of  training  are  mechanical 
skills  or  habits.  Training  involves  building  up  a  set  of 
responses  to  objective  stimuli.  The  stimuli  themselves, 
the  controlled  associations  called  forth  by  them,  and 
motor  responses,  are  the  elements  out  of  which  such  skills 
are  built.  The  products  of  training  have  two  funda- 
mental aspects:  speed,  or,  better,  rate,  the  amount  of 
work  done  per  unit  of  time;  and  accuracy,  or  the  re- 
lation  of  the  work  that  is  correct  to  the  total  work  done. 
Both  of  these  are  easily  measurable  in  objective  units. 
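
The two measures just defined can be sketched in a few lines. This is only an illustration: the function names, the four-minute allowance, and the sample paper are assumptions made for the example and are not part of the survey's own procedure.

```python
def rate(examples_tried: int, minutes: float) -> float:
    """Amount of work done per unit of time (examples per minute)."""
    return examples_tried / minutes


def accuracy(examples_right: int, examples_tried: int) -> float:
    """Per cent of the work done that is correct."""
    if examples_tried == 0:
        return 0.0  # assumption: a paper with no work attempted scores zero
    return 100.0 * examples_right / examples_tried


# Hypothetical paper: 6 examples tried in 4 minutes, 5 of them right.
print(rate(6, 4))             # 1.5
print(round(accuracy(5, 6)))  # 83
```

Keeping the two functions separate mirrors the text, which treats rate and accuracy as distinct aspects of the same product.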

The  higher  thought  processes  of  the  second  type 
(reasoning)  are  much  more  complex,  and  the  products  of 
school  training  in  them  are  much  less  clearly  defined. 
Moreover, since in testing work all reasoning problems must
be represented through printed symbols, the actual results
obtained in a reasoning test are merely unanalyzed resultants
of reasoning ability and ability in reading. In view of
the  many  uncertainties  and  difficulties  connected  with  test- 
ing such  abilities,  and  interpreting  the  results,  it  was  de- 
cided to  limit  the  measurement  of  arithmetical  products  to 
the  fundamental  skills. 

TESTS  OF  SKILLS 

For the mechanical skills of arithmetic, well standardized
tests and standards and a growing volume of comparative
data are available for interpretative purposes.
The  Courtis  Standard  Research  Tests,  Series  B,  measure 
the  end  products  of  training.  Certain  of  the  arithmetic 
tests  used  at  Cleveland,  namely,  those  dealing  with  the 
various  phases  of  multiplication  and  fractions,  trace  the 
relative  development  of  these  abilities.  The  tests,  as  a 
whole,  therefore,  show  plainly  the  nature  of  the  devel- 
opment and  the  character  of  the  product  of  the  classroom 
teaching  of  the  fundamental  skills. 

The  expression  "end  product"  needs  definition  and 
explanation.  In  multiplication,  for  example,  it  is  easy 
to  show  that  the  products  of  training  in  the  various  grades 
differ  greatly  in  complexity.  In  most  school  systems 
the  children  begin  development  of  skill  in  multiplication 
by  learning  the  multiplication  tables.  At  a  later  grade 
they  master  the  technique  of  carrying.  Soon  after  that 
they  are  able  to  multiply  a  four  or  five  place  number 
by  any  single  digit,  and  finally  they  are  able  to  multiply 
any  integral  number  by  any  other  integer.  This  is  the 
end  of  the  development  in  multiplication  itself,  although 
further  training  in  the  use  of  this  skill  is  necessary.  A  test 
which  is  designed  to  measure  the  most  complex  form  in 
which  a  given  skill  is  found  is  a  test  of  the  end  product. 

The significant points to be noted in the foregoing
discussion are two: (1) that some children in every grade
above the third complete their development in multiplication
as far as their maturity permits; (2) that each type
or partial phase of development is in reality a distinct
ability. Each of these points will be discussed further.

The type of ability selected to represent the end
product  is,  of  course,  a  mere  matter  of  convention. 
The  convention  adopted  for  the  Courtis  Tests  is  that  in 
any  operation  the  units  selected  shall  be  the  smallest 
that  cover  all  the  essential  elements.  These  elements 
for  the  different  operations  are  as  follows: 

Elements Covered by Type of Examples Used in the Arithmetic
Tests (Series B)

ADDITION
1 Knowledge of Combinations
2 Bridging the Tens
3 Carrying
4 Attention Span
5 Fatigue

SUBTRACTION
1 Knowledge of Combinations
2 Borrowing
3 Fatigue

MULTIPLICATION
1 Knowledge of Combinations
2 Place Value
3 Carrying
4 Addition
5 Fatigue

DIVISION
1 Knowledge of Combinations
2 Place Value
3 Estimation of the Quotient
4 Multiplication
5 Subtraction
6 Fatigue

The  smallest  types  of  examples1  that  can  be  selected  to 
include  these  elements  in  their  simple  form  are  given  below: 


ADDITION

927  297  136  486
379  925  340  765
756  473  988  524
837  983  386  140
924  315  353  812
110  661  904  466
854  794  547  355
965  177  192  834
344  124  439  567

SUBTRACTION

107,795,491    75,088,824    160,620,971    80,861,837
 77,197,029    57,406,394     51,274,387    25,842,708

MULTIPLICATION

3,597    5,739    4,268    6,428
   73       85       37       58

DIVISION

94)85,352    37)9,990    73)53,765    49)81,409

1Except in subtraction.

The figures in these examples are not determined by
chance, but in accordance with a systematic plan. For
instance, in the multiplication examples the reader should
notice that in multiplying 3597 by 73 and 4268 by 37
a child is called upon to use every one of the combinations
of the three and seven tables except the one and
the zero combinations, for 2, 3, 4, 5, 6, 7, 8, and 9 are all
represented in the multiplicands. In similar fashion,
care  is  taken  to  test  all  combinations  and  situations 
throughout  the  tests,  the  combinations  omitted  being 
only  those  which  appropriate  tests  have  proved  are  of 
extreme  simplicity.  Equal  care  is  taken  in  all  of  the  tests 
to  cover  for  each  operation  every  factor  mentioned  above, 
and  enough  material  is  provided  to  keep  even  the  bright- 
est child  busy  for  at  least  four  minutes.  For  in  four 
minutes  the  average  child  will  reveal  any  marked  ten- 
dency to  make  errors  because  of  a  lack  of  control  of  those 
forces  which  tend  to  divert  attention  after  a  few  minutes 
of  continuous  activity  of  a  single  type,  forces  commonly 
described  by  the  word  fatigue. 
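
The claim about the three and seven tables can be checked mechanically. The sketch below is an illustration only; the helper name `combinations_used` is hypothetical, and it simply pairs every digit of a multiplier with every digit of its multiplicand.

```python
def combinations_used(multiplicand: int, multiplier: int) -> set:
    """Pairs (table, digit) a child must recall when working one example."""
    return {(int(t), int(d)) for t in str(multiplier) for d in str(multiplicand)}


# The two examples named in the text.
used = combinations_used(3597, 73) | combinations_used(4268, 37)

# Every combination of the three and seven tables except the one and zero ones.
needed = {(table, digit) for table in (3, 7) for digit in range(2, 10)}

print(needed <= used)   # True
print(len(used))        # 16
```

Between them the two examples call for all sixteen combinations and nothing else, which is what the systematic plan described above requires.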

The  care  taken  in  the  construction  of  the  examples  for 
these  tests  makes  possible  the  construction  of  other  tests 
of  equal  difficulty  but  differing  in  every  answer.  At  the 
time  the  Gary  survey  was  begun,  three  such  editions 
were  in  general  use  throughout  the  country.  These 
were  known  respectively  as  Forms  1,  2  and  3.  Form  3 
was used at Gary for the first test, but in order to prevent
any possible suggestion that there might have been
direct preparation for the tests, a fourth edition, Form 4,
was prepared and used for the first time at Gary. This
was done before the tests of Form 3 had been scored.

The  tests  were  given  under  precisely  the  same  condi- 
tions from  the  fourth  to  the  twelfth  grades,  inclusive. 
Even  in  the  fourth  grade,  children  were  found  who 
showed  by  their  scores  that  they  possessed  to  a  greater 
or  less  degree  all  the  abilities  measured  by  the  tests. 
For  instance,  the  percentages  of  fourth  grade  children 
who  equal  or  excel  the  eighth  grade  median  score  for 
examples  correctly  worked  are: 


                               ADDITION        SUBTRACTION     MULTIPLICATION  DIVISION
                               FORM 3  FORM 4  FORM 3  FORM 4  FORM 3  FORM 4  FORM 3  FORM 4

Per cent. of fourth grade
children equaling or
exceeding eighth grade         6.7     6.3     1.5     6.0     1.0     .01     .06     .08

Per cent. of fourth grade
children getting one
or more examples right         60      60      16      69      49      66      17      21

These  figures  would  make  it  evident  that  the  tests 
measure  very  simple  skills,  the  teaching  of  which  is 
completed  in  most  schools  by  the  fourth  grade,  since, 
at  the  time  the  tests  were  given,  from  one  sixth  to  two 
thirds  of  the  children  "knew  how"  to  get  at  least  one 
example  right.  This  means  that  were  these  children 
given  time  enough,  they  could  complete  every  example 
in  the  test  and  get  every  example  right.  Therefore,  as 
given, the tests measure skill, or ability to do, not mere
knowledge of "how to do." In the lowest fourth grade
class (4C) there are none, or very few children, who, at
the beginning of the year, can do long division, but by
the end of the 4A classes, a knowledge of this process has
also been acquired. The tests are, therefore, measures
of the end product of teaching effort, and the changes in
scores from grade to grade are due to changes in skill,
not to changes in knowledge.

Increases  in  skill  from  grade  to  grade  are  determined 
by  three  sets  of  causes.  Part  of  it  is  due  to  increasing 
maturity  of  the  children  as  they  pass  through  the  grades.1 
Eighth  grade  children  are  certain  to  show  a  higher 
rate  of  work  than  fourth  grade  children,  simply  be- 
cause they  are  older  and,  consequently,  have  more 
highly  developed  nerves  and  muscles.  Part  of  it  is 
caused  by  daily  use  of  the  four  processes  in  arithmetic 
in  and  out  of  school.  A  child  is  under  a  steady  pressure 
from  his  teachers  and  his  school  work  to  perfect  his  skills, 
and  both  rate  of  work  and  accuracy  are  benefited. 
And  part  of  it  is  caused  by  teaching  effort.  If  a  teacher 
of  the  sixth  grade  discovers  that  a  particular  child 
does  not  know  certain  fundamental  combinations,  or 
has  faulty  habits  of  carrying,  or  does  not  know  how  to 
control  the  critical  pulses  of  his  attention,  she  may  make 
such  explanations  and  provide  such  training  that  the 
individual  will  overcome  his  peculiar  difficulties  and  rise 
at  once  to  new  levels  of  ability. 

The  second  point  to  be  noted  is  that  each  type  of 

1See  Chapter  VIII,  page  357. 



element  that  enters  into  the  end  product  is  itself  a 
distinct  ability.  Inferences  from  one  test  to  another 
must  be  made  with  extreme  caution.  The  organization 
of  individual  minds  varies  so  greatly  that  one  dare  not 
say,  before  an  actual  test  has  been  given,  that  because 
a  child  makes  a  high  score  in  such  examples  as  3768 
multiplied  by  74,  he  will,  therefore,  make  a  high  score 
in  a  test  composed  of  such  simple  elements  as  8  multi- 
plied by  4,  5  multiplied  by  7,  3  multiplied  by  6. 

For  instance,  the  three  Cleveland  tests  used  at  Gary 
consisted  of  examples  of  the  following  types. 

    A              B          C

8      6         3498       3498
4      7            7         47


Type  A  involves  only  knowledge  of  the  combinations. 
Type  B  requires  also  ability  to  carry,  while  type  C  covers 
all  the  elements  listed  previously.  While  the  median 
eighth  grade  score  in  Test  A  is  14  answers  per  half  minute 
with  an  accuracy  of  93  per  cent.,  the  individual  abilities 
range  from  7  answers  per  half  minute  with  an  accuracy 
of  100  per  cent,  to  22  answers  per  half  minute  with  an 
accuracy  of  100  per  cent.  (Table  XLII,  page  182).  It 
would  seem  on  first  thought  that  when  one  of  two  children 
has  three  times  the  knowledge  of  the  multiplication  tables 
that  the  other  possesses  he  must  have  correspondingly 
greater  ability  to  multiply  in  long  division  examples.  The 
results show, however, that it is the child with the least
ability in the tables that makes the highest scores in the
complex test. Many evidences of such personal idiosyncrasies
will be noted. To what extent chance individual
variation  is  the  cause  of  the  apparent  discrepancies 
in  the  particular  results  quoted  is  not  known,  but  from 
personal  experience  in  teaching  such  children  the  writer 
is  ready  to  vouch  for  the  existence  of  such  conditions. 

No  valid  conclusion  may  be  drawn  from  a  few  selected 
cases,  but  the  coefficients  of  correspondence  between 
these  tests  computed  on  the  basis  of  the  scores  of  42 
eighth  grade  children  present  for  all  tests  tend  to 
show  that  there  is  less  correspondence  between  the 
simple  and  complex  tests  (about  70  per  cent.)1  than 
between  the  various  trials  of  the  complex  tests  (about 
85  per  cent.)  as  far  as  rate  of  work  is  concerned,  while  for 
accuracy  the  correspondence  is  low  in  all  tests  (about 
50  per  cent.)  (Table  XLIII,  page  187). 

In  spite  of  this  lack  of  correspondence  between  individ- 
ual scores,  the  class  scores  of  the  identical  tests  agree 
almost  exactly.  The  median  rate  for  the  eighth  grade 
scores  was  8  examples  attempted  for  both  tests,  while 
the  median  accuracy  in  the  one  class  was  74  per  cent, 
and  in  the  other,  75  per  cent.  Similar  results  are  usually 
obtained  whenever  a  series  of  tests  of  the  same  type  of 
ability,  but  of  graded  complexity,  are  used  together.  Under 
the  circumstances,  it  is  safest  to  regard  each  test  as  meas- 
uring a  separate  ability,  and  to  make  no  inferences  from 
one  test  to  another,  except  on  the  basis  of  actual  results. 

1See Chapter XI of Appendix A, page 475.

COMPUTATION OF MEDIANS

The  methods  of  tabulation  and  of  computing  medians 
need  explanation.  The  form  of  record  sheet  for  the 
Series  B  tests  was  the  standard  record  sheet  designed  for 
the  tests  and  in  general  use  throughout  the  country. 
The  method  of  computing  medians  was  the  approximate 
method  also  in  general  use  in  these  tests.  But  because 
of  the  low  scores  made  by  the  Gary  children,  this  method 
does  not  always  yield  exact  results.  The  differences, 
where  they  occur,  however,  are  in  the  main  favorable  to 
the  Gary  children,  tending  to  raise  their  scores  above  the 
true  level. 

An  illustration  will  make  this  plain.  In  Figure  32,  page 
189,  are  shown  two  forms  of  tabulation  sheet.  The  one 
at  the  left  of  the  figure  is  the  standard  sheet,  although 
not  arranged  in  the  conventional  manner.  The  sheet  is 
designed  to  show  the  relation  between  rate  of  work 
and  accuracy,  and  to  enable  the  accuracy  to  be  found 
readily  without  unnecessary  computation. 

The  rate  scores  are  shown  along  the  left  of  the  sheet, 
and  the  total  distribution  for  rate  of  work  in  the  column 
at  the  extreme  right.  This  distribution  is  not  distorted 
in  any  way,  and  from  it  the  true  median  may  be  found. 

TABLE XLIII

[...] Between Results from Different Divisions of
Cleveland Arithmetic Tests

Multiplication (Sets C, G and L). Based upon Scores of Eighth
Grade [...]. Total number of cases, 42. Also relation of Scores in Test
[...] Cleveland Arithmetic Test) to Tests D and E.
D and E are two trials of the same test as C, except that the time
allowed is six minutes instead of three, and that the tests were given
at a different time and as part of the Series B tests.

CLEVELAND
ARITHMETIC    MEDIAN              MEDIAN DEVIATION    TOTAL RANGE
TEST          RATE    ACCURACY    RATE    ACCURACY    RATE    ACCURACY
A             14      93.5        3       6.5         7-22    7[?]-100
B             6       86.0        1       14.0        8-8     88-100
C             5       63.5        1       19.5        1-7     0-100
D             8       74.0        2       15.0        2-12    0-100
E             8       75.0        1       14.0        2-13    0-100

Percentage of Total Cases Which Do Not Vary in Position
More Than One Unit, or One Half a Unit of Variability

              RATE                ACCURACY
COMPARISON    1 UNIT    ½ UNIT    1 UNIT    ½ UNIT
A with B      71        40        60        29
A with C      67        36        60        21
B with C      98        43        60        19
C with D      86        62        40        26
C with E      81        29        38        14
D with E      76        60        55        24

This table is to be read as follows: If the relation of the scores of individual
children in number of examples attempted in Test A to the
median number made by the class as a whole be compared with the
relation of the same individual scores in Test B to the median number
of examples made by the class as a whole, 71 per cent. of the children
are found to have maintained the same relative position in the two
[...] scores within one unit of variability. That is, within three
answers in Test A, or one example in Test B.

The actual scores made by a child are in terms of
examples tried and examples right, as 6 tried and 5
right. Such scores, however, should be described in
terms of rate and accuracy of work. Rate is, of course,
the number of examples finished per unit of time, or,
more simply, the number of examples finished, since all
the  grades  have  the  same  time  allowance.  The  accuracy 
is  the  relation  of  the  number  of  examples  right  to  the 
number  of  examples  tried,  expressed  as  a  rate  per  cent. 
To  avoid  the  computation  necessary  to  find  the  accu- 
racy in  each  case,  the  record  sheet  is  divided  into 
vertical  columns,  each  corresponding  to  a  given  range 
of  accuracy.  To  enable  the  scores  to  be  entered  directly 
on  the  record  sheet  without  the  necessity  of  this  com- 
putation, the  small  figures  under  the  rate  score  show 
for  each  rate  the  accuracy  column  in  which  the  given 
right  score  would  fall.  Thus,  to  enter  the  score  6 
examples  tried  and  5  right,  the  tabulator  would  look 
up  at  the  rate  column  until  6  was  found,  then  move 
horizontally  across  the  "six"  row  until  5  was  found. 
One  tally  would  then  be  marked  in  this  rectangle,  and 
a  glance  at  the  top  of  the  column  would  show  that  the 
accuracy  was  from  80  per  cent,  to  89  per  cent.,  average 
85  per  cent. 
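
The lookup performed on the record sheet can be sketched as follows. The column boundaries are an assumption: the text fixes only the 80 to 89 column (credited as 85) and says the lowest column is much wider than the rest, so the ten per cent columns above 50 and the single column below 50 are guesses made for illustration, as are the function names.

```python
def accuracy_column(tried: int, right: int) -> tuple:
    """Return the (low, high) accuracy range of the column for one score."""
    if tried <= 0 or not 0 <= right <= tried:
        raise ValueError("need tried > 0 and 0 <= right <= tried")
    exact = 100 * right / tried
    if exact == 100:
        return (100, 100)
    if exact < 50:
        return (0, 49)  # assumption: one wide column for all low accuracies
    low = int(exact // 10) * 10
    return (low, low + 9)


def column_average(column: tuple) -> int:
    """Nominal accuracy credited to every score tallied in a column."""
    low, high = column
    if low == high:
        return low
    return (low + high + 1) // 2  # the 80-89 column is credited as 85


column = accuracy_column(6, 5)
print(column)                  # (80, 89)
print(column_average(column))  # 85
print(round(100 * 5 / 6))      # 83
```

The score of 6 tried and 5 right is thus credited with 85 per cent. although its exact accuracy is about 83 per cent.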

The  accuracies  thus  found  are  only  approximate,  but 
for  all  columns  except  the  lowest  the  result  will  never 
differ  from  the  true  accuracy  by  more  than  5  per  cent, 
at  the  most.  In  the  illustration  above  the  difference 
is  2  per  cent.  (85-83).  On  the  average,  the  difference 
will  be  much  less  than  this  amount,  and  will  as  often 
be  positive  as  negative.  This  statement  does  not  apply 
to  the  last  column  where  the  variation  in  accuracy  may 
amount to 25 per cent., although usually it is less than
this amount. However, as it makes little difference in
meaning whether an accuracy score is 37 per cent. or
12 per cent., when both are so low as to be of no
practical  value,  the  errors  in  this  part  of  the  scale  are  not 
serious. 

The  source  of  the  error  lies  in  the  fact  that  scores  of 
widely  different  values  are  grouped  together.  If  the 
reader  will  refer  to  Figure  32,  he  will  see  that  three 
children  who  made  scores  of  6  examples  tried  have 
scores  for  examples  right  which  bring  them  into  the  lowest 
accuracy column. These scores may be either 2, 1, or 0
examples right; the record sheet does not discriminate
between them. On the average, the scores will be 1
and the accuracy 16 per cent., but if in a particular case
they happen to be either 2 or 0 the actual accuracy will
be either 33 per cent. or 0.

In  the  case  of  the  Gary  scores,  the  effect  of  such  lack 
of  discrimination  is  to  raise  the  accuracy  scores.  If  the 
conventional  distribution  of  the  scores  for  examples 
right  is  made  as  shown  in  the  tabulation  in  the  table  on 
the  right  of  Figure  32,  the  median  score  for  examples 
right  will  be  1.7  examples.  The  class  accuracy  based 
upon  the  median  score  of  five  examples  attempted  would 
accordingly be 34 per cent. instead of 52 per cent., as
shown by the form of tabulation used. Such large differences,
however, occur infrequently. To check these
results,  the  scores  of  all  fourth  and  eighth  grade  classes 
were  tabulated  both  ways. 

For instance, in addition, in 79 per cent. of the cases
the  differences  are  either  zero,  or  show  that  the  accuracy 


TABLE XLIV

Differences in the Accuracies of Fourth and Eighth Grade Classes1

                   NUMBER OF TIMES
                 STANDARD    "RIGHT"     TOTAL    SUM OF         AVERAGE
OPERATIONS       LARGER      LARGER      CASES    DIFFERENCES    DIFFERENCES

Addition            26          7          33        165            5.0
Subtraction         27          6          33        242            7.3
Multiplication      24          8          32        149            4.7
Division            23          6          29        183            6.3

Total              100         27         127        739            5.8

1As tabulated in the Standard Record Sheet and as computed from a tabulation of
examples right.

This table is to be read as follows: 33 class scores as to accuracy
in addition were computed in two ways: one using the standard
record sheet shown in Figure 32, and the other tabulating
the number of examples correctly worked and computing the accuracy
from the median number of examples right compared with the median
number of examples attempted. In 26 cases the first method yielded
larger results; in 7 cases the second method yielded larger results. The
sum of the 33 differences in scores, without respect to sign, is 165 per
cent.; the average difference, 5 per cent. The average difference for
all cases is 5.8 per cent.; that is, four times out of five the standard
method yields scores which are, on the average, 6 per cent. higher
than they would have been had they been computed from the number
of examples worked correctly.

determined by the standard record sheet is higher than
that determined from the number of examples right
(Table XLIV, page 191). The average amount of this
difference is 6 per cent. The accuracy scores given in
this report, therefore, are either the true scores
or are too high by an amount which, on the average,
is 6 per cent.
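
The addition figures can be checked from the numbers given in the reading note to Table XLIV. The sketch below is a minimal Python illustration that uses only those stated figures (33 classes, 26 in which the standard record sheet gave the larger result, and a sum of differences of 165); the variable names are hypothetical.

```python
cases = 33                 # addition class scores computed both ways
standard_larger = 26       # cases in which the record sheet gave the larger result
sum_of_differences = 165   # per cent., without respect to sign

share_larger = 100 * standard_larger / cases
average_difference = sum_of_differences / cases

print(f"{share_larger:.0f} per cent. of cases, average difference {average_difference:.1f}")
```

The two results, about 79 per cent. and 5.0 per cent., agree with the figures quoted for addition.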
It may be contended that the median accuracy should
have  been  computed  from  the  accuracy  of  the  individual 
papers.  That  this  is  better  is  conceded,  but  the  time 
cost  is  prohibitive,  and  the  advantage  small.  In  the  case 
of  the  class  shown  in  Figure  32,  page  189,  for  instance,  the 
median  class  accuracy  computed  from  the  individual 
accuracies is 50 per cent., as compared with 52 per cent.
computed from the standard record sheet and 34 per
cent. computed from the median number of examples
right. When the median accuracy does not fall much
below 50 per cent., the standard method is much to
be  preferred  because  it  preserves  the  scores  in  their 
fundamental  relationships,  and  the  results  will  differ 
very  little,  if  at  all,  from  those  obtained  in  the  longer, 
but  more  accurate  method.  Moreover,  to  be  comparable 
with  the  results  from  other  cities,  the  Gary  results  must 
have  been  obtained  by  the  same  methods.  For  these 
reasons  the  standard  method  was  used  at  Gary  in  both 
the  Series  B  and  the  Cleveland  Tests.  In  the  case  of 
the  Cleveland  Tests,  however,  to  make  comparisons 
possible,  it  was  necessary  to  tabulate  all  classes  in  both 
ways. 
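
The difference between two of the methods compared above can be illustrated with a short Python sketch. The class below is hypothetical (it is not the class of Figure 32), the function names are illustrative, and the plain median of the standard library is used rather than the interpolated median of the report; the grouping of the standard record sheet itself is not reproduced.

```python
from statistics import median

# Hypothetical class: (examples attempted, examples right) for each pupil.
pupils = [(5, 1), (6, 3), (4, 2), (5, 4), (7, 2), (5, 5), (3, 0)]


def median_of_individual_accuracies(scores):
    """Median of each pupil's own accuracy, in per cent."""
    return median(100 * right / tried for tried, right in scores)


def accuracy_from_medians(scores):
    """Median examples right divided by median examples attempted."""
    med_tried = median(tried for tried, _ in scores)
    med_right = median(right for _, right in scores)
    return 100 * med_right / med_tried


print(median_of_individual_accuracies(pupils))
print(accuracy_from_medians(pupils))
```

For this made-up class the median of the individual accuracies is 50 per cent., while the ratio of the medians is 40 per cent., a gap of the same kind as that reported for the class of Figure 32.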

RELIABILITY     OF     MEASUREMENTS 

As  the  Series  B  tests  were  given  twice,  the  data  afford 
a  basis  for  the  discussion  of  the  reliability  of  a  single  test. 
For  instance,  the  difference  in  the  scores  of  the  13  classes 
in  the  Jefferson  School  which  were  tested  twice  with 
each of the four tests, 52 differences in all, is, for the most

Figure 33

Differences in Class Scores in Jefferson School for Two
Trials of Series B1

Each quarter of the diagram is a graph of the results for one operation.

The  scale  along  the  base  of  the  figures  shows  the  numbers  of  the  classes. 
In  each  diagram  the  straight  line  marked  O  represents  the  score  made  in 
the  first  trial.  The  scale  along  the  left  hand  vertical  axis  shows  the 
number  of  examples  the  rate  score  in  the  second  trial  is  greater  or  less 
than  the  corresponding  score  in  the  first  trial.  The  scale  along  the  right 
hand  vertical  axis  shows  the  number  of  per  cent,  the  accuracy  score  in 
the  second  trial  is  greater  or  less  than  the  corresponding  score  in 
the  first  trial.  The  solid  line  represents  differences  in  rate.  The  dotted 
line  represents  differences  in  accuracy. 

It should be noted (1) that there are no consistent differences in any of
the diagrams, (2) that a gain in rate is often accompanied by a loss in
accuracy, and vice versa, indicating that the changes in score are merely
fluctuations in the methods of work, (3) that the differences are gross
differences caused by changes in class membership, or changes in ability due to
training, or changes caused by any other factors that may be operating.

1The tests were given five weeks apart.

Figure 33, Continued

In classes 9, 10, 11, and 12 in multiplication, the gains probably indicate
growth due to training, but throughout the remainder of the diagrams
there is little evidence of either growth or of consistent differences
in difficulty between the two editions of the tests.

part, one example or less in the number of examples attempted
and 10 per cent. or less in accuracy. Only
about one difference in five will exceed these limits
(Table XLV, page 192), and 55 out of 102 differences are
positive. That is, the scores tend to be slightly higher
on the second trial.

A  careful  study  of  these  data,  however,  shows  the 
variations are of two types. In some of the classes
changes in rate and accuracy are in opposite directions,
in others the two are in the same direction, and in still
others there is practically no change (Figure 33). The
results show plainly that a number of different factors
are  at  work. 

A  factor  that  might  cause  change  in  scores  is  a  change 
in difficulty from test to test. In the first trial, Form
3 of the Series B test was used, while in the second trial
Form 4 was used. Form 4 is constructed to be of equal
difficulty with Form 3 on an objective basis; that is, the
same combinations were employed throughout, and, as
nearly as possible, in the same arrangement. The variation
in difficulty from one form to the other should not
be large, but it is quite impossible to check the relative
difficulty of the tests except by very carefully conducted
experiments.

In the results shown in the previous table it will be
seen  that  the  classroom  scores  do  not  differ  in  any 
characteristic  way.  Sometimes  the  scores  from  Form  3 
are  larger  than  those  from  Form  4;  sometimes  they  are 
smaller.  These  are  indications  that  the  differences  are 
not  caused  by  any  marked  differences  in  the  difficulty 
of  the  tests  themselves.  As  a  further  check  upon  the 
differences  from  test  to  test,  the  number  of  times  each 
example  was  missed  in  Trial  1  and  Trial  2  is  tabulated. 
Some  of  the  examples  of  Form  4  proved  to  be  missed  by  a 
smaller  number  of  children  than  in  Form  3,  and  the  others 
by  a  larger  number,  depending  upon  the  operation 
(Table XLVI, page 198). The average difference per
example was 2.6 per cent. In 60 per cent. of the cases the
differences  were  positive. 

There  are  also  evidences  of  differences  in  difficulty 
from example to example. It must be remembered,
however, that the children who complete the various
examples are a different group, as only the most able
children reach the later examples (Table XLVII, page 200).
The  results  show  that  the  units  of  which  the  tests  are 
composed  are  fairly  equal  as  measured  by  even  the  small 
number  of  scores  at  Gary  and  that  the  differences  from 
test  to  test  are  not  very  great.  Those  who  feel  inclined 
to  question  the  equality  of  the  units  of  which  the  tests 
are  composed,  or  the  equality  of  Forms  3  and  4,  should 
remember first that each group of four examples1 of the
addition tests calls for the use of the same combinations,

1The number is different for each operation.

Table XLVII, Continued

This table is to be read as follows: Example 1, multiplication test,
was missed by 30 per cent. of the ninth grade children, by 24
per cent. of a group of 33 ninth grade children who finished the entire
10 examples, by 47 per cent. of a group of 36 10th to 12th grade children,
and by [...] per cent. of a group of fifth grade children. In other words, the
table supplies four determinations of relative difficulty of the first four
examples, based upon the performances of different groups of children.

It should be noted that there are marked variations in the relative
difficulty of the different examples as determined by the different
groups. Thus, example 3, Form 3, is missed in the ninth grade by less
than half as many children as example 1, but in the group of 33 ninth
grade children who completed 10 examples the numbers of children
missing the 1st and 3rd examples are equal. Similarly, for Form 4, the
[...] example is harder than the 1st example, according to the results of
the first group, nearly equal according to the results of the second group,
[...] according to the results of the third group, and harder according
to the results of the fourth group. In other words, the determination
of the difficulty of units from small groups of children within any one
school system does not yield consistent results.

so that the four examples taken together make one unit;
second, that Form 4 is made directly from Form 3 so
that any child taking one form and then the other is
called upon to do exactly the same work (Figure 34);
and, third, that the data in the table are too few to
eliminate the effects of chance and individual variation.
Moreover, even the Gary class results previously discussed
show that the variations in class scores from one
trial to another are relatively insignificant. Under the
circumstances, the discussion above shows that in spite
of the factors operating to produce variation, the two
forms are equal within about 5 per cent. or less.

One of the most important factors causing a difference
in score is the difference in the reaction of the children
to the test situation. The first time the test is taken
most individuals tend to hold themselves in restraint.
They proceed cautiously, on the lookout for difficulties.
On the second trial, however, when they know what to
expect, they work more freely. The result is an increased
rate and decreased accuracy.1

Many illustrations, both of increase in rate and decrease
in accuracy, and increase in accuracy and decrease
in rate were found. (Tables XLV and XLVIII.) All
such  differences  will  be  said  to  be  caused  by  a  change 
in  method  of  work.  The  amount  of  change  in  rate 
necessary  to  produce  a  given  change  in  accuracy  is, 
however,  not  known  and  apparently  varies  with  the 
different  classes  and  different  individuals.  Consequently 
in  all  cases  of  change  of  method  it  is  impossible  to  tell 
whether  or  not  there  has  been  any  real  change  of  ability 
in  the  interval  between  tests. 
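
The distinction drawn here depends only on the directions of the two changes, and can be sketched as a small Python function. The tolerances that separate a change from no change are assumptions chosen for illustration; the report gives no such thresholds, and, as the paragraph above notes, a change in method cannot by this means be told apart from a change in ability.

```python
def classify_change(rate_change: float, accuracy_change: float,
                    rate_tol: float = 0.5, accuracy_tol: float = 5.0) -> str:
    """Label the change between two trials by its direction only.

    rate_change is in examples, accuracy_change in per cent. The
    tolerances are hypothetical values, not figures from the report.
    """
    def sign(value: float, tol: float) -> int:
        if abs(value) <= tol:
            return 0
        return 1 if value > 0 else -1

    rate_sign = sign(rate_change, rate_tol)
    acc_sign = sign(accuracy_change, accuracy_tol)

    if rate_sign == 0 and acc_sign == 0:
        return "practically no change"
    if rate_sign * acc_sign < 0:
        # Rate and accuracy moved in opposite directions.
        return "change in method of work"
    if rate_sign >= 0 and acc_sign >= 0:
        return "gain"
    return "loss"


# One example gained in rate with 24 per cent. lost in accuracy.
print(classify_change(1, -24))
```

A gain of one example in rate with a loss of 24 per cent. in accuracy is labeled a change in method of work, since the two changes run in opposite directions.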

A  third  factor  which  undoubtedly  influences  many 
scores  is  the  change  in  class  membership.  Attention 
has  already  been  called  to  the  fact  that  attendance  at 
Gary  is  exceedingly  variable,  and  in  the  results  previously 
given  (Table  XLV,  page  192)  no  account  has  been  taken 
of  changes  in  membership.  An  analysis  was  made  of  the 
scores  of  the  children  from  a  class  whose  scores  show  a  large 
loss  to  determine  the  effect  of  individual  variation  (Table 
XLVIII, page 204). There were 3 children present at the
first trial who were absent on the second, 10 children present
the second trial who were not present the first trial.
Of the 23 children present at both trials, the scores of 5
show a real gain, some of them in both rate and accuracy,
11 individuals show very little change, or else a change in
method. That is, in this class, the large loss in accuracy is
due almost entirely to the change of method of work, for
the children present at both trials have increased their rate
score one example, and lost in accuracy 24 per cent.
This effect, however, is somewhat masked in the general
class scores by the fact that children present at the first
trial were less able than the average of the class, and
tended to reduce the first class score.

1See X of Appendix A, page 452.

A  similar  analysis  was  made  for  a  class  in  which  there 
was a larger gain in accuracy (Table XLIX, page 206).
The results show that this gain was due partly to seven
children who show a real gain, and partly to the fact that
a number of children worked more slowly in the second
test  with  increased  accuracy,  that  is,  to  eight  children 
who  show  an  apparent  gain  in  accuracy  due  to  change 
in  method.  Analysis  of  random  selections  of  other  classes 
showing  similar  large  differences  gave  similar  results. 
Nothing  was  found  to  indicate  that  there  were  any  real 
differences  in  difficulty  between  the  two  addition  tests. 

The  fourth  factor  tending  to  produce  change  in  score  is 
school training. This is particularly true in the Beveridge
school in which systematic drill work was observed
following the first test. In most of the classes, however,
there is no evidence that any great amount of change
of score can be attributed to such cause. Most of the
variations shown in the table are due either to chance
variations or to changes in method.

A  tabulation  of  all  the  differences  in  the  class  scores  in 
the  two  trials  of  the  Series  B  tests  in  all  schools  showed 
(Table L, page 211) that about 10 per cent. of the classes
made exactly the same score in the two tests, 78 per cent.
of the classes made the same score in rate within one
example, 62 per cent. made the same score in accuracy
within 10 per cent. As the scores of the Gary children
of the eighth grade average 8 examples, a change of
one example worked correctly would mean a change in
accuracy, if the rate scores were constant, of 12 per cent.,
so that about three fourths of the class scores do not
vary more than one example in their rate or accuracy.
The median of 211 differences between the class scores
of the first and second trials of the tests is about one half
an example in rate and 7.6 per cent. in accuracy.
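
The percentages quoted in this paragraph can be recovered from the frequencies of Table L. The following Python sketch is a minimal check under one stated assumption: the fourth rate frequency is taken as 30, the value that makes the rate column add to the stated total of 211. The function and variable names are hypothetical.

```python
# Frequencies of variation between trials, as read from Table L.
# Rows: 0, 1-4, 5-9, 10-14, 15-19, 20-24 (tenths of an example for rate,
# per cent. for accuracy). The fourth rate frequency is taken as 30 so that
# the column adds to the stated total of 211 (an assumption).
rate_freq = [19, 81, 64, 30, 8, 9]
accuracy_freq = [24, 54, 54, 44, 19, 16]


def share_within(freqs, rows):
    """Per cent. of classes falling in the first `rows` rows."""
    return 100 * sum(freqs[:rows]) / sum(freqs)


rate_within_one_example = share_within(rate_freq, 3)           # 0-9 tenths
accuracy_within_ten_per_cent = share_within(accuracy_freq, 3)  # 0-9 per cent.
one_example_in_eight = 100 * 1 / 8  # one example right, of 8 attempted

print(f"{rate_within_one_example:.1f}")
print(f"{accuracy_within_ten_per_cent:.1f}")
print(one_example_in_eight)
```

The sketch gives about 77.7 and 62.6 per cent., which agree to within rounding with the 78 and 62 per cent. quoted above, and 12.5 per cent. for one example in eight.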

Of  these  differences,  the  positive  differences  have  a 
ratio  to  the  negative  differences  of  about  2:1,  indicating 
the  general  tendency  of  the  classes  to  make  a  higher  score 
on  the  second  test.  This  is  undoubtedly  due  to  the 
practice  effects  of  repeating  the  test,  as  has  already  been 
indicated. 

The  reliability  of  the  rate  scores  of  a  single  test  is 
high, about 90 per cent.1 That is, only about 10 per
cent. of the children will be misrepresented by their rate
scores in any one test. In accuracy, however, variations

1Pearson's coefficient of reliability, 42 eighth grade children, rate
scores, addition, Trials 1 and 2, was +.90, P. E. ±.02.

Series B, Four Operations

AMOUNT OF        FREQUENCY, RATE       FREQUENCY, ACCURACY
VARIATION*       NUMBER      %         NUMBER      %

0                  19        10          24        12
1-4                81        38          54        25
5-9                64        30          54        25
10-14              30        14          44        21
15-19               8         4          19         9
20-24               9         4          16         8

Total             211†      100         211†      100

Median            .56 Examples           7.6%

*For rate of work the amount of variation represents tenths of an example. For
accuracy the amount of variation represents per cent. of accuracy.
†[...] scores were lower on second trial, [...] higher.

This table is to be read as follows: Of all repeated tests 19, or 10
per cent. of the whole, made exactly the same median score in rate in
both trials; 24, or 12 per cent., made exactly the same median score in
accuracy. Half the classes varied .55 of an example or less in rate,
and 7.6 per cent. or less in accuracy.


arise much more readily and the reliability of the tests
is low. This is shown clearly by the coefficients of correspondence
(Table LI, page 213). About three fourths of
the children will maintain their relative positions in the
distribution of rate scores through the two trials, and
about 40 per cent. of the children in accuracy.1


1Pearson's coefficient of reliability, 43 eighth grade children, accuracy
scores, multiplication, Trials 1 and 2, was + .[...], P. E. ± .09.
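
The coefficients footnoted in this discussion can be illustrated with a short Python sketch. It computes Pearson's coefficient for a pair of hypothetical score lists and the classical probable error of the coefficient, 0.6745 (1 - r^2) / sqrt(n). The report does not state the formula it used, so its use here is an assumption, and the data and function names are illustrative only.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's product-moment coefficient of correlation."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def probable_error(r, n):
    """Classical probable error of r: 0.6745 * (1 - r**2) / sqrt(n)."""
    return 0.6745 * (1 - r ** 2) / sqrt(n)


# Hypothetical rate scores of six pupils in two trials.
trial_1 = [6, 8, 9, 7, 10, 5]
trial_2 = [7, 8, 10, 7, 11, 6]
r = pearson_r(trial_1, trial_2)
print(round(r, 2), round(probable_error(r, len(trial_1)), 2))

# The footnoted rate coefficient: r = +.90 for 42 pupils.
print(round(probable_error(0.90, 42), 2))
```

Under this formula the footnoted rate coefficient of +.90 for 42 pupils gives a probable error of about .02, which agrees with the value printed in the footnote.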
It cannot be too strongly emphasized that educational
measurements  differ  from  measurements  in  the  physical 
sciences  chiefly  in  the  fact  that  the  quantities  measured 
in  education  vary  enormously  with  slight  changes  in 
conditions.  The  length  of  a  metal  rod  is  changed  so 
little  by  temperature,  pressure,  and  other  factors  that 
we  come  to  think  of  the  length  as  a  quantity  independent 
of  the  conditions.  Ability  to  add  correctly,  on  the  other 
hand,  is  so  dependent  upon  the  conditions  under  which 
the  adding  is  done  that  it  can  scarcely  be  said  to  exist 
independently  of  them.  In  other  words,  as  has  been 
repeatedly  pointed  out,  a  test  does  not  measure  ability, 
it merely registers performance under the given conditions.

The  conditions  revealed  by  the  tests  are  not  created 
by  them.  Whenever  two  separate  measurements  by 
the  same  test  are  possible,  a  greater  range  of  individual 
variation  than  one  would  expect  is  always  revealed. 
Many  tests,  however,  cannot  be  repeated,  and  for  very 
few  are  there  available  second  editions  of  equal  value 
to the first. The conclusion to be drawn from the repeated
tests is, of course, that a series of tests is necessary
to determine with any certainty the ability of an individual,
but that variations in individual achievement
take place in accordance with such fixed laws that class
scores based on the scores of groups of children are very
reliable. In other words, the scores made by the Gary
children in the arithmetic tests are proved, by the repetition
of the tests, to represent actual conditions. The
scores made by the Gary children in all the tests of mechanical
skills agree in showing that the Gary children
work slowly and very inaccurately.

Correspondence Between Results of Two Trials of
Series B1

                                    MEDIAN        MEDIAN DEVIATION     TOTAL RANGE
                                  RATE   ACCY.     RATE    ACCY.      RATE     ACCY.

Addition, 1st                      8     61        2       14         4-15     0-100
Addition, 2nd                      7.6   67        1.5     14         3-15     0-100
Subtraction, 1st                   9     83        1        8         6-16    14-100
Subtraction, 2nd                   9     79        1       12         6-17     0-100
Multiplication, 1st                8     74        2       15         2-12     0-100
Multiplication, 2nd                8     76        1       14         2-13     0-100
Division, 1st                      7     83        1       17         1-13     0-100
Division, 2nd                      7     84.6      1       13.5       3-14    20-100
Multiplication, Cleveland Tests    6     63.6      1       19.5       1-7      0-100

1Based on scores of 4[...] eighth grade pupils, tested at intervals of four weeks.

Percentage of Total Cases Which Do Not Vary in Relative
Position More Than One (or One Half)
Unit of Variability

                                          RATE                ACCURACY
                                     1 UNIT   1/2 UNIT    1 UNIT   1/2 UNIT

Series B, comparing Trial 1 with 2
  Addition                             78       65          36       21
  Subtraction                          78       36          45       29
  Multiplication                       76       50          66       24
  Division                             60       24          31       12
Multiplication, Trial 1, Series B      86       62          40       26
Multiplication, Trial 2, Series B      81       29          38       14

This table is to be read as follows: If in the first trial of the addition
test, the relations of the scores of individual children to the score of
the class as a whole be compared with similar data based upon the
scores of the same individuals in the second trial, 78 per cent. of the
children will be found to have maintained the same relative position
in the two sets of rate scores, and 36 per cent. in the two sets of accuracy
scores, within one unit of variability. That is, within 2 examples
for rate, and within 14 per cent. for accuracy.
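
The comparison of relative positions described in the reading note to the table of correspondence can be sketched in Python as follows. This is one possible reading of that description, not the survey's own procedure: each pupil's position is taken as the difference between his score and the class median, and a pupil is counted as holding his position when that difference changes by no more than one unit of variability between trials. The scores and the function name are hypothetical; the unit of 2 examples is the one quoted for rate in addition.

```python
from statistics import median


def share_holding_position(trial_1, trial_2, unit):
    """Per cent. of pupils whose position relative to the class median
    changes by no more than `unit` between two trials."""
    med_1, med_2 = median(trial_1), median(trial_2)
    held = sum(
        1
        for a, b in zip(trial_1, trial_2)
        if abs((a - med_1) - (b - med_2)) <= unit
    )
    return 100 * held / len(trial_1)


# Hypothetical rate scores for eight pupils in two trials.
trial_1 = [8, 6, 10, 7, 12, 5, 9, 8]
trial_2 = [7, 9, 10, 8, 11, 9, 8, 8]
print(share_holding_position(trial_1, trial_2, unit=2))
```

For these made-up scores six of the eight pupils, or 75 per cent., hold their relative position within the unit of 2 examples.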

EFFECT  OF  TEST  CONDITIONS 

The  claim  is  sometimes  made  that  such  poor  work  in 
the  tests  is  due  largely  to  the  fact  that  the  children  are 
working  under  artificial  conditions;  that  if  occasion  were 
to  arise  for  the  use  of  these  same  skills  in  the  achievement 
of some purpose which seemed worthy of effort, the children
would respond to the motivated situation in a manner
which would prove their ability to cope with it.

This  statement  is  both  true  and  untrue.  It  is  true 
that  a  slow,  inaccurate  worker  will,  under  the  spur  of 
sufficient  incentive,  repeat  his  computations  many  times 
until  he  is  finally  able  to  arrive  at  the  correct  results. 
As  has  already  been  pointed  out,  by  far  the  greater 
number  of  children  tested  both  at  Gary  and  in  other 
school  systems  would  be  able  to  solve  correctly  every 
example  in  every  test  (except  certain  of  the  fraction 
tests) if a sufficient incentive should lead them to attempt
such an achievement, and if there were no time
limit.  So  far  the  claim  above  is  based  upon  fact.  The 
part  of  the  statement  that  is  not  true  is  the  implication 
that  because  two  groups  of  children  attain  the  same 
goal  their  achievements  are  equal,  and  the  trainings 
which  made  the  achievement  possible  are  of  equal  value. 
For  if  the  children  of  one  school  require  less  time  than 
those of another school to accomplish the same task,
they are more skilful. The sole purpose of the arithmetic
tests given at Gary was to determine the degree of
skill  possessed  by  the  Gary  children  under  the  given 
conditions.  Such  measurements  are  of  value  because 
many  persons  have  already  determined  for  themselves 
the  degree  of  skill  they  think  a  child  of  a  given  age  or 
grade  should  have  under  the  test  conditions.  The  one 
additional  point  to  be  noted  is  that  in  the  judgment  of 
the  survey  staff  the  test  is  a  suitable  measure  for  the 
type  of  instruction  found  in  the  classrooms  at  Gary. 
If,  therefore,  the  reader  will  realize  that  no  attempt  is 
being  made  to  draw  inferences  from  the  results,  other 
than  in  regard  to  the  degree  of  skill  in  the  mechanical 
operations  of  arithmetic,  the  discussions  of  this  chapter 
will  serve  to  show  the  degree  of  dependence  that  may 
legitimately  be  placed  upon  the  conclusions  reached. 


VI.    ENGLISH  COMPOSITION 
§1. General Results

LANGUAGE work at Gary is allotted 16 per cent.
of the total time given to the fundamentals, and
this corresponds almost exactly to the per cent.
of  time  allowed  this  type  of  work  in  the  average  of 
fifty  American  cities.  The  actual  number  of  hours  at 
Gary,  798,  is  somewhat  less  than  the  average,  864  hours, 
in  the  conventional  schools,  but  the  difference  is  so  slight 
that  one  may  fairly  say  work  of  this  type  receives  the 
same emphasis at Gary as elsewhere. English composition
is the one phase of language work which is measurable
at  present.  Such  surveys  as  have  been  made  in  other 
school  systems  show  that  in  general  the  products  of  school 
training  in  written  composition  are  far  from  satisfactory. 
It  was  felt,  however,  that  here,  if  anywhere,  the  training 
peculiar  to  Gary  should  produce  results.  Accordingly, 
a  test  of  ability  in  English  composition  was  given. 

TESTS 

No  attempt  was  made  to  measure  oral  composition, 
and,  of  the  four  recognized  forms  of  written  composition, 
the  testing  work  was  limited  to  the  simplest,  narration. 
Children  were  asked  to  write  a  story  of  some  interesting 
or exciting experience1 of their lives. Subjects were
suggested, but the children were urged to choose for
themselves, and, for the most part, they did. Children
wrote freely in the presence of the examiners and were
given ample time (fifteen to twenty minutes). The actual
number of minutes and seconds taken by each child was
noted, however.

1The instructions and conditions of the Composition Test given in the
Denver Survey were followed closely.

SCORING 

In  all  but  the  lowest  grades  the  children  counted  the 
number  of  words  written  and  later  these  scores  were 
verified  by  the  examiners.  The  papers  were  also  scored 
for  quality,  for  number  of  errors,  and  in  other  ways, 
all  such  scoring  being  done  by  trained  men  under  carefully 
controlled  conditions.  The  quality  of  the  compositions 
was  measured  by  means  of  the  Hillegas  Scale.1 

RESULTS 

A  typical  eighth  grade  composition  is  shown  in  Figure 
35.  The  handwriting  in  the  sample  is  a  little  better 
than  the  eighth  grade  median  in  handwriting  (quality 
Ayres' Scale 45, actual eighth grade median in composition
test 39, generalized score 42), and the misspellings
are less frequent than for the median spelling
paper for the eighth grade (spelling coefficient of the
illustration, 8; median eighth grade spelling coefficient,
19.7), but in style, subject, structure, and range of vocabulary
it is a representative paper. So far as it is not


1A discussion of the reliability of measurement of quality in composition
by the Hillegas Scale will be found in §1, page 147.



Figure 35

The scores on the Hillegas Scale assigned to this sample by five
judges are 37, 40, 45, 55, and 65. Median value, 45. Median quality of
eighth grade composition for entire city, 45.8. This sample is, therefore,
a representative paper.
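
The median value quoted for this sample can be checked directly. The sketch below is a minimal Python illustration; it takes the five judges' scores as 37, 40, 45, 55, and 65, and the variable names are hypothetical.

```python
from statistics import median

# Hillegas values assigned to Sample A by the five judges
# (the second score is read as 40).
judge_scores = [37, 40, 45, 55, 65]
sample_value = median(judge_scores)

city_median = 45.8  # eighth grade median quality for the entire city

print(sample_value, round(city_median - sample_value, 1))
```

The median of the five scores is 45, which lies within one point of the city median of 45.8.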

Half of the eighth grade children wrote compositions equal to or better
than this sample, and half compositions equal to or worse than this sample.

Figure 35, Continued

Sample A
(Hillegas Value 45)
An Exciting experience

[...]s a very dark damp
[...] the woods, We went
[...] after watching the
[...]ter we had been
[...]bout three hours we
[...]akened by a queer
[...]g noise, at first we
[...]aid we talked over
[...]e would do if the
[...]lew away. The
[...]as growing worse
[...] came down in tor-
[...]d came through into
[...]as a deep mist, we
[...] so wet it was no
[...]y to sleep so we
[...] our selves over the
[...] stove, the storm
[...]rse and Earle went
[...] loosened the guy
[...]ghtening the tent
[...]t the same time,
[...]ed an hour before
[...] back when all of a
[...]e heard a yell from

outside we ran out and
there stood Earle horrified
because as we looked
through the trees down the
beach came a sort of a
misty red ball tearing up
everything in front of it.
It turned toward us but it
was coming slowly we
turned and ran into the
tent put out the stove and
ran for some bushes be-
tween some large trees.
We got there just in time
because when we dropped
into the bushes we heard
a crash and behind us,
When we looked to see the
rotten tree beside the tent
had been torn down and
unfortunatly fell on the
tent tearing the canvas and
breaking the center pole.
That is the last hurrican
we ever want to see

Figure 36

Sample B

Part of an Eighth Grade Composition
(Hillegas Value 30)


As  we  went  to  spent  our 
vacation  I  happen  to  be 
right  near  the  mountains 
I  was  glad  couse  I  could  go 
and  climb  just  as  higch  as 
I  want  to 

So  I  went  with  my  father 
and  mother  we  went  pvery 
hiegh.    it  was  getting  cold 


already  why  I  think  abouve 
the  clouds  I  want  to  rich 
the  tops  but  couldnt  couse 
there  was  ice  and  it  was 
so  sleapry  to  goe  any 
further  so  we  came  bad 
when  we  came  down  there 
was  many  more  mountains 
and   I   disided    to   go  on 


Approximately 10 per cent. of the eighth grade compositions were
as poor as this sample. Approximately half of the fourth grade papers
were as poor as or worse than this sample.

Figure 36, Continued

Sample  C 

Eighth  Grade  Composition 
(Hillegas  Value  60) 


One  evening,  about  nine 
years  ago,  we  left  Rochester 
N.  Y.  for  a  trip  up  the  St. 
Lawrence  River.  There 
were  four  in  our  party,  a 
cousin,  Mother,  Father,  and 
myself.  Going  up  the  river 
it  was  rather  rough  and  it 
being  my  first  experience 
on  a  boat  I  was  rather  sick, 
altho  the  thot  of  sleeping 
in  a  boat  all  night  was  a 
most  interesting  one  for  me. 

Coming  home,  the  second 
evening  we  had  been  on 
the  boat,  we  were  all  sitting 
on  the  deck.    The  sun  was 


just  setting  in  the  west. 
The  reflection  on  the  water 
was  made  up  of  goldenpink 
and  a  little  bit  of  red.  The 
water  on  one  side  of  the 
boat  was  like  glass.  On 
the other side it was rippling
just a trifle and the
reflection  seemed  to  be 
lavender.  This  was  one  of 
the  most  beautiful  sights  I 
had  seen  and  as  young  as 
I  was  I  remember  it  yet. 
I  sat  and  gazed  upon  the 
water,  with  my  doll  in  my 
arms,  until  I  fell  asleep. 


Approximately 10 per cent. of the eighth grade papers were equal to or
better than this sample. Approximately half of the twelfth grade
papers were equal to or better than this sample. Approximately 80 per
cent. of the eighth grade papers would fall between Sample B and
Sample C.

representative, the sample is better, not worse, than the
typical eighth grade product. The specimen therefore provides
the reader with the opportunity to see for himself the
degree of merit in English composition which represents
the outcome of eight years' training in writing English.

The amount of progress from the fourth to the twelfth
grades is from quality 29.9 to 62.2 Hillegas (Table LII) and
may be observed in the difference between samples "B"
and "C" in Figure 36, page 220. Sample "B" represents
the median quality of the fourth grade papers, and sample
"C" the median quality of the twelfth grade papers.
Approximately 80 per cent. of the eighth grade papers
fall between these limits; that is, a little less than 10 per
cent. of the eighth grade papers are as poor as, or poorer
than, sample "B," and no more than 10 per cent. of the
eighth grade papers are equal to, or exceed, sample "C."

A study of the complete distribution of the scores
assigned to the papers in the Gary composition test shows
that of the entire group of 1429 children tested, but 29
have papers of quality 70 or better,1 and all but one of
these are found in the high school grades (Table LIII,
page 224). This means that very few of the children
have much power in the selection of subject matter, the
organization of material, or the choice of words.

Composition  ability  changes  in  rate  in  a  uniform  way 
up  to  the  ninth  grade,  but  there  is  little  further  progress. 
In  quality  there  is  but  3  points  difference  (32.8-29.9) 

1 Reference to the analysis of the composition scale, page 245, and to
the samples in Appendix A, page 416, will show that quality 70 is the
lowest quality that represents a desirable end product. It is only fair to
Gary to add that standards based on the actual achievements of children
fall far below 70. Trabue eighth grade standard, 50. Nineteen per cent.
of the 1429 pupils equaled or exceeded quality 50. If eighth and higher
grades only are considered the percentage 50 or better would be 54 per cent.
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TABLE  LII 
Quality of Illustrative Samples: English Composition

SAMPLE   SOURCE      GRADE   MEDIAN QUALITY, SCALE   QUALITY EQUALED
                             (5 judges)              OR EXCEEDED BY

B        Froebel     8th     30, Av. Dev. 3.6        50% of 4th grade
A        Emerson     8th     45, Av. Dev. 8.6        50% of 8th grade
C        Jefferson   8th     60, Av. Dev. 7.6        50% of 12th grade

GRADE            4       8       12

Highest 10%      39.1    60.0    70.0
                 34.4    60S
Median           29.9    45.8    62.2
                 24.9    41.4
Lowest 10%       18.8    34.2    48.3

The table is to be read as follows: Sample B was written by an
eighth grade child in the Froebel School. It is rated as value 30 on the
Hillegas Scale (average deviation of five judgments, 3.6 points). Its
quality is equaled or exceeded by 50 per cent. of the fourth grade
papers.

The 10 per cent. of the children making the highest scores in the
fourth grade were rated 39.1 Hillegas or better; in the eighth grade,
60.0 Hillegas or better; in the twelfth grade, 70.0 Hillegas or better.
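
The ratings quoted in the note above rest on two simple statistics: the median of several judgments and their average deviation. The following is a minimal sketch in Python, not the survey's own procedure. The function name and the five ratings are hypothetical, and the deviation is assumed to be measured from the median, which the text does not state.

```python
from statistics import median


def median_and_average_deviation(scores):
    """Return the median of the scores and their mean absolute deviation from it."""
    mid = median(scores)
    avg_dev = sum(abs(s - mid) for s in scores) / len(scores)
    return mid, avg_dev


# Hypothetical ratings of one paper by five judges (not taken from the survey).
judges = [26, 28, 30, 33, 37]
mid, avg_dev = median_and_average_deviation(judges)
print(mid, round(avg_dev, 1))
```

A small average deviation means that the judges placed the paper at nearly the same point on the scale; a large one means that the rating depends heavily on who did the scoring.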



Figure 37
Development in Composition

[Figure: English composition development curve, speed and quality; curves labeled Composition and Reproduction.]


Scale along the base of the figure: rate in number of words written per
minute. Scale along the vertical axis: quality on the Hillegas Scale.
Light line, composition; dotted line, reproduction of simple story after
reading. Small circles on composition curve indicate position of
grade scores in both rate and quality. The small figures near circles
refer to grades. The reproduction curve is based upon the actual rates of
reproduction and the quality of the composition test. The eighth grade
quality of reproduction was judged to be equal to that of composition,
but the dotted line, as a whole, represents a theoretical line. The curve
is drawn to make evident the actual difference in rate of writing between
reproduction and composition. The difference increases from
grade to grade and the curves show that quality in the Gary schools is
produced at the expense of rate.

Note that the light dotted line which represents the actual scores for
development of ability in English composition begins to rise rapidly in
the sixth grade, and that the growth during high school years is almost
entirely in quality. To conform to results in conventional schools the
composition curve should have the same general form as the curve for
reproduction and fall half way between the two curves in the figure.

between the fourth and sixth grade, but the change
from the sixth to the eighth grade is 13 points. The
eighth grade median score is nearly 46 Hillegas. The
ninth grade is but slightly higher, but the tenth and
eleventh grade scores raise the level 18 points. The twelfth
grade score is lower than that of the eleventh grade.
That is, of the 34.3 points of difference in quality between
the fourth and eleventh grades, 30.2 points gain is made
in four of the grades. The growth in high school grades
is almost wholly in quality. (Tables LIII, LIV, pages
224, 227, Figure 37, page 225).

Teachers of English hold that in compositions there
should be increasing freedom from error from grade to
grade, and increasing power both to choose the words
best adapted to the expression of a given thought and to
organize the words chosen into coherent discourse.
Accordingly, the eighth grade papers were subjected to a
series of analyses to determine the number and character
of the various errors made. Papers were marked for
errors in capitalization, punctuation, spelling and grammar
and were also analyzed as to range of vocabulary.

On the average, a Gary eighth grade child makes .8
of an error in capitalization, 1.8 errors in punctuation,
and 3.9 errors in grammar, or a total of 6.4 errors in writing
an original composition of 214 words (Table LV, page
228). When it is remembered that the errors selected
for scoring include gross errors only, it will be seen that
the Gary product in its mechanical aspects is not very
satisfactory.
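
To compare error counts between compositions of different lengths, it helps to express them as a rate. The sketch below is only an illustration of that arithmetic, using the total of 6.4 errors and the length of 214 words stated above; the function name is hypothetical and no such rate is reported in the survey.

```python
def errors_per_hundred_words(total_errors, words):
    """Convert an error count for a composition into errors per 100 words."""
    if words <= 0:
        raise ValueError("words must be positive")
    return 100.0 * total_errors / words


# Total errors and composition length as stated in the text above.
rate = errors_per_hundred_words(6.4, 214)
print(round(rate, 1))
```

On these figures the average eighth grade paper carries about three gross errors in every hundred words.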

An important phase of school training is the development
of an adequate vocabulary. Fifty-three per cent.
of the different words and 86 per cent. of the running
words used by the eighth grade children in Gary are
classified by Jones as second grade words (Table LVI, page 231,
Figure 38, page 232). The Gary results include every
word used in the 127 eighth grade compositions analyzed,
while the Jones vocabulary includes only words used by
at least 2 per cent. of the children of a given grade. If the
Gary vocabularies had been restricted to different words
used at least three times, the percentage of the second
grade words would have been seventy-five.1
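
The distinction between "different words" and "running words" can be made concrete with a short sketch. It is an illustration only: the function name, the toy sentence, and the small reference set standing in for Jones' second grade list are all hypothetical, and lower-casing the words is an assumption about how duplicates would be merged.

```python
from collections import Counter


def vocabulary_coverage(words, reference):
    """Return (% of different words, % of running words) found in reference."""
    counts = Counter(w.lower() for w in words)
    different = len(counts)            # each distinct word counted once
    running = sum(counts.values())     # every occurrence counted
    different_in = sum(1 for w in counts if w in reference)
    running_in = sum(c for w, c in counts.items() if w in reference)
    return 100.0 * different_in / different, 100.0 * running_in / running


# Hypothetical toy data; "carbon" is the only word outside the reference list.
text = "the boat went up the river and the carbon fell".split()
reference = {"the", "boat", "went", "up", "river", "and", "fell"}
print(vocabulary_coverage(text, reference))
```

Because common words recur, the share of running words is normally higher than the share of different words, which is the pattern the Gary figures show (86 per cent. against 53 per cent.).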

Again, a careful study of the vocabulary2 fails to show
any clear effect of the special training at Gary. For
instance, the word "carbon" is used three times, but a
reading of the composition in which it occurs shows that
the word was acquired at the time the boy was in a booth
with his brother, a moving-picture operator. ("The
red hot carbon fell out of the lantern and set fire to the
film.") Similarly, in one case, "auditorium" refers
not to school work, but to the name of a theatre in
Chicago; "pottery" to Indian pottery seen on a trip to New
Mexico. One of the few possible exceptions is a set of
words, "mummy, art, beetles," etc., which are used in
describing a trip to the Art Museum in Chicago. The
trip was taken with nine other boys from Gary and may
have grown out of school work, although no reference is
made to teacher or school. As far as it is possible to
judge, the various words which do not appear in Jones
may as well have been acquired from actual life
experiences as from school activities.

1 Comparative data from conventional schools are not available.
However, a tabulation of a random sampling of all the words used in the
eighth grade compositions gave 54 per cent. of the words five letters or
less in length. A similar random sampling of Jones' second grade words
gave 45 per cent. five letters or less in length. That is, the Gary
vocabulary contained a larger proportion of the simpler, smaller words. See
also page 414.

2 In Appendix A, VII, page 454, will be found a list of all the eighth
grade words which are not either proper nouns, words in Jones' List, or
derivations of those words. That is, it contains all words which might
in any way be peculiar to Gary.

The topics chosen as subjects by the 127 children
included in the eighth grade classes were carefully studied
and tabulated (Table LVII, page 233). The children
followed instructions and wrote, for the most part, about
simple, childish interests in striking occurrences of daily
life. There is little in them to show that the interpretation
placed upon these experiences by the children has
been influenced in any way by school training.

TABLE LVII
Composition Subjects: Eighth Grade Classes

Total number of papers ............................................ 127

Analysis A
RELATIVE TO LIFE OF CHILD

Events in the life of the writer (exciting) ........................ 49
Descriptions of scenes or accounts of experiences (not exciting) ... 30
Accounts of incidents observed in the life of others (exciting) .... 19
Description of trips ............................................... 13
Accounts of experiences related by others (not seen) ............... 11
Dreams, ghost stories, and imaginary events ........................  4
Experiment in physics ..............................................  1
Total ............................................................. 127

Analysis B
TYPES OF EXPERIENCES

Accidents, runaways, and collisions ................................ 23
Experiences, fishing, swimming, walking, and skating ............... 22
Trips .............................................................. 20
    By boat ................ 10
    By rail ................  6
    By auto ................  4
Hikes in woods or country .......................................... 12
Storms ............................................................. 18
    Rain ................... 11
    Snow ...................  5
    Hail ...................  2
Fires ..............................................................  6
Errands ............................................................  5
Miscellaneous ...................................................... 21
Total ............................................................. 127

Analysis C
SOURCE

Farm, country, or woods ............................................ 30
City ............................................................... 24
Rivers, lakes, or ocean ............................................ 23
A trip of some kind ................................................ 15
Home ............................................................... 12
School .............................................................  7
On way home after school ...........................................  7
Miscellaneous ......................................................  9
Total ............................................................. 127

SCHOOL  TO  SCHOOL  COMPARISON 

In making comparisons from school to school, marked
differences were found. Of the 13 classes tested in
the Jefferson School 2 had composition scores markedly
above the city scores and one below. Of the 12 classes
in the Beveridge School none was above the city score
and 7 below (Table LVIII, page 235). Comparisons on
the basis of composition rate, or rate of reproduction,
yield results which are very similar. The schools
in order of rank are: Jefferson, Emerson, Froebel, and
Beveridge. It is probable, however, that these differences
are not in any way due to special program features.
If the differences observed were due to the enriched
curriculum, the order of schools would probably be:
Emerson, Froebel, Jefferson, and Beveridge; Froebel
being put second instead of first to allow for the difficulty
in language work with the children of foreign born parents.
Under the circumstances, the differences shown
in the table are probably not significant from the point
of view of this investigation.

COMPARATIVE  DATA 

The question of the value of the Gary product as compared
with similar work in other schools cannot be settled
as definitely as for other subjects, because few comparative
data are available, and studies of the reliability
of scoring by the Hillegas Scale are conflicting. The
Gary eighth grade score (45.8 Hillegas) is higher than
the corresponding Butte score (41 Hillegas) and lower
than those given in the Salt Lake City Survey (54 Hillegas)
(Table LIX, page 236, Figure 39, page 237).

TABLE LIX
Median Quality of Composition (Hillegas)

[Table body illegible in the source. Surviving entries: grade 4: 29.9, 71; grade 8: 45.8, 55, 41, 46, MS, 53.]

... from Butte and Salt Lake City.

... printed in the Salt Lake City Survey. To make the values comparable
with Gary they were computed in a peculiar manner. See page 147.

This table should be read as follows: The median fourth grade score
in quality of composition at Gary was 29.9 Hillegas. Trabue's fourth
grade standard, 35; Starch standard, 26; fourth grade score at Butte, 2j;
Salt Lake City, 20; Nassau County, 38; Mobile, Alabama, 33; Mobile
County, 32; South River, N. J., 13; Chatham, N. J., vj.

The scores of the eighth grade classes were as follows: Froebel, class
45, 40 Hillegas; class 46, 40. Emerson, class 14, 47.2; class 15, 44A.
Jefferson, class 18, 50.

Figure 39
Comparative Development

[Figure: vertical axis, quality by Hillegas Scale; horizontal axis, grades.]

The scale along the bottom of the figure represents grades, the scale
along the vertical axis represents quality by the Hillegas Scale. The
solid line represents Gary; broken line, Butte; dotted line, Salt Lake
City.

The Gary results are better than those from Butte and not so good as
those from Salt Lake City. The Gary and Salt Lake City curves are the
same to the fifth grade, but the value of the scores in Salt Lake City
increases more rapidly than in Gary, so that by the eighth grade the
Gary scores are about two years behind. Note the increased rate of
growth in the high school grades. For comments on the reliability of
scoring by the Hillegas Scale see Section 2 of this Chapter, page 239.

In making the Denver and Grand Rapids surveys the
Willing Composition Scale was used. The values of this
scale do not correspond to those of the Hillegas Scale, but
through the kindness of Mr. Willing all but one of the
eighth grade classes1 were scored by him personally, so
that the Gary results might be directly comparable with
the Denver and Grand Rapids scores. The median quality
of the Denver eighth grade papers, written on the
same subject and under the same conditions as the Gary
papers, was 63.5; of the Grand Rapids papers, 65.0; of the
Gary papers scored by Mr. Willing, 61.3.*

It is extremely probable, therefore, that on the basis
of such comparative data as are available at present the
Gary compositions should be judged to be about equal to
the products of composition training in conventional schools.*


1 The fifth one was omitted because of lack of time.

The median of the Hillegas scores of the same 97 papers by the Gary
judges was 46.3 as compared with 45.8 for the entire grade, so the
omission of the one class did not greatly influence the result.

The personal judgment of the author is that this conclusion will not
stand when more comprehensive comparative data are available.
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§2.    Critical  Discussion 

The measurement of ability in English composition
presents a problem of greater difficulty than the measurement
of any of the abilities previously discussed.
The factors which determine the merit of a composition
and those which affect the judgment of the scorer are
many. Little is known about their relative value. The
ordinary marking of teachers varies enormously, both
from teacher to teacher, or for any one teacher from day
to day, and from sample to sample. Also the use of a
composition scale has been attacked by certain teachers
of English. It is important, therefore, that the method
of marking the papers from the composition test in the
survey be explained in detail.

SCALE  USED 

The marks reported are in terms of the Hillegas Scale.1
Unfortunately, the purpose and value of the scale have
been so little understood that the use of the scale for
survey purposes must be justified. It must be admitted
at once that the scale cannot be used effectively without
training. The effects of a first reading of the scale are
apt to be irritation and a sense of the impossibility of
attempting to judge of the quality of a composition by
reference to other compositions of a totally different
character. Yet it is easy to show that such judgments are
not only possible, but are accurately and consistently
made, once certain viewpoints and experiences are
gained.

1 Teachers College Record, Vol. 13, No. 4, page ax.

The Hillegas Scale provides for marking of compositions
on an absolute basis. That is, a mark given by
the scale means just one thing, the degree of merit
possessed by the composition as a composition, and
entirely apart from any consideration of the age or grade of
the author, the conditions under which it was written,
or the purpose for which the mark is given. That is,
50 Hillegas units of composition are comparable in meaning
to 50 inches or units of length, 50 pounds or units of
weight, or 50 units of any other quantity which may be
measured in absolute terms.

The  reader  unfamiliar  with  the  use  of  scientific  units 
for  the  measurement  of  educational  products  should  see 
that  the  composition  scale  has  for  one  of  its  purposes  the 
bringing  to  light  of  the  very  items  which  teachers'  marks 
conceal.  The  teacher  of  a  fourth  grade  class  recognizes 
a  paper  as  an  exceptional  paper  for  the  grade  and  marks 
it,  let  us  say,  95  per  cent.  The  teacher  of  the  eighth 
grade  class  also  assigns  a  mark  of  95  per  cent,  to  a  paper 
from  her  class.  Numerically,  the  two  papers  are  equal; 
yet  both  teachers  would  agree  that  one  paper  is  better 
than  the  other.  The  marks  have  served  the  teacher's 
purpose,  but  from  the  point  of  view  of  a  survey  they 
conceal  the  most  important  elements  that  the  survey 
wishes  to  reveal,  i.e.,  the  real  quality  of  the  compositions 
and  the  amount  of  progress  which  has  been  made  from 
the fourth to the eighth grades. If, however, the two
papers are marked in terms of the scale and one is found
to be of quality 40, while the other is of quality 80, it is
possible to say at once that the one has twice as much
merit as the other.

Many persons will admit the desirability of absolute
marking, but do not believe that the scale can be used
for such a purpose. It is easy to prove, however, that
everyone recognizes gross differences in general merit.
If the reader will scan, superficially or carefully, the two
samples below (taken from the papers written at Gary)
he will have no difficulty in recognizing that as samples
of English composition one represents a more advanced
stage of development than the other.

SAMPLE 1

"One the day I was on a wonderful jourray. I travelled
in the mountains and as I travelled about Two days and
accident happen to me because I slipped of the rock and
hurt my foot. And so I went on, and seen a bear behind
me and I started to run away. And as I ren I fell into
the water and then the bear disappeared. And so I
went home very lately.

"Next morning I was traveling in the different parts.
When I was about in the middle of the forest"

SAMPLE 2

"Ice," called George, a jolly, big, fat negro as he
entered the yard with his wagon and mules. Today
the ice wagon, however, was to be the conveyance for a
picnic party and they were to traverse the beautiful
solitary roads in the mountains of North Carolina. There
was a grand scramble, the picnicers climbed in, seated
themselves on the hay, laughing and talking until the
gloominess of the woods and the stony, socalled roads
attracted their entire attention."

These two samples, however, constitute a rough scale
by which any other samples may be measured; for if one
reads Sample 3 (below) he will have no difficulty in
recognizing that it is intermediate in value between the other
two.

SAMPLE  3 

"This  is  a  real  experience.  It  happened  at  Miller's 
beach  in  the  summer  of  1913. 

"My  brother,  another  young  man,  and  myself  went 
out  in  a  row  boat. 

"When  we  had  gotten  about  half  a  mile  from  the  shore 
my  brother  dived  off  of  the  boat.  He  came  up  once  and 
went  down  again,  he  came  up  again  and  went  down  again 
he  came  up  again  and  went  down  again.  We  waited 
awhile  but  he  did  not  come  up. 

"Our  friend  dived  off  after  him  and  after  some  difficulty 
located  him. 

"He  brought  him  up  out  of  the  water  and  after  they 
had  struggled  awhile  knocked  him  senseless." 

Suppose that more and more samples were thus read
and assigned a place in relation to the samples previously
read. It is evident that the time would come when the
difference from sample to sample would be so slight that
it would be difficult to make judgments with certainty,
just as it is impossible to tell with the eye alone whether a
given bar is 3.65 inches long or 3.66 inches. The series
of samples as a whole would give an illustration of every
type of sample from the worst to the best.
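
The procedure just described, in which each new sample is assigned a place among those already read, can be sketched as an insertion by pairwise comparison. This is an illustration under stated assumptions, not the method of the scale's author: the function names are hypothetical, the binary search is a convenience, and the judge's opinion is simulated here from the values the text reports for Samples 1, 2, and 3 (35, 75, and 48).

```python
def place_sample(ordered, new_sample, is_better):
    """Insert new_sample into ordered (worst to best) by pairwise judgments."""
    low, high = 0, len(ordered)
    while low < high:
        mid = (low + high) // 2
        if is_better(new_sample, ordered[mid]):
            low = mid + 1   # new sample belongs above the one compared
        else:
            high = mid      # new sample belongs at or below it
    ordered.insert(low, new_sample)
    return ordered


# Stand-in for a human judge: values reported in the text for the three samples.
merit = {"Sample 1": 35, "Sample 2": 75, "Sample 3": 48}


def judge(a, b):
    return merit[a] > merit[b]


rough_scale = ["Sample 1", "Sample 2"]
place_sample(rough_scale, "Sample 3", judge)
print(rough_scale)
```

As the series grows, neighboring samples come to differ so little that the judge's comparisons cease to be dependable, which is the limit of discrimination the paragraph describes.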

The Hillegas Scale provides a series of selected samples
whose values have been determined by a statistical
procedure which enables the results to be expressed
in units according to a consistent plan. The scorer
may compare a given specimen with the scale and
determine its value from the values of the scale samples
between which it is judged to fall, and this can be done
accurately and consistently.1
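
One way to picture how a value is derived from the two scale samples between which a specimen falls is simple linear interpolation. The text does not say that the judges computed marks in this manner; the sketch below merely assumes it for illustration, and the function name, the scale values of 40 and 50, and the fraction are all hypothetical.

```python
def value_between(lower_value, upper_value, fraction):
    """Value of a specimen judged to lie `fraction` of the way from the
    lower scale sample to the upper one (0.0 = equal to lower, 1.0 = upper)."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    return lower_value + fraction * (upper_value - lower_value)


# Hypothetical case: a paper judged a quarter of the way from a scale
# sample valued 40 toward one valued 50.
print(value_between(40, 50, 0.25))
```

In practice the fraction is itself a judgment, so the result is no more exact than the scorer's power to discriminate between neighboring samples.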

ANALYSIS  OF  SCALE 

This, however, is the point at which many stumble,
for misunderstanding easily arises. Many see in the
scale only a series of distinct compositions differing in
content and style. They do not generalize from these
compositions and carry in mind a general concept of
"progress-in-English-composition-as-a-whole" of which
the samples are merely particular illustrations. Yet the
fact of such progress is self-evident. The average child
entering the first grade cannot express his thoughts in
writing, and, naturally, when he begins to do so, his
attempts are very imperfect. After twelve years of
training, however, a high level of ability may be reached,
and the progress from the lower to the higher level
follows certain general tendencies which are represented
objectively in the samples of the scale. It becomes
important, therefore, to express these gradations of
development in their generalized form and to use the
samples of the scale only as an aid to judgment in
determining the precise value to be assigned a given
composition.

1 Values of the samples above in terms of Hillegas Scale: Sample 1,
one judge, 35. Sample 2, average of two judgments, 75. Sample 3,
median of five judgments, 48. Average deviation 2.6.

The writer's generalization of the Hillegas Scale is as
follows: The development of ability in English composition
passes through three phases. At first there is the
struggle to master the mere mechanics of expression,
the spelling of words and their arrangement in the
conventional order. Then follows a second stage in which
the efforts of the person writing are expended mainly
upon organization of subject matter, the selection of the
details to be expressed and their organization into
connected discourse. Finally there is the stage of development
of literary merit in which choice of words, and the
selection and organization of subject matter cease to be
mechanical and become artistic (Figure 40, page 245). Of
course, no hard and fast lines can be drawn between one
phase of ability and another, but the main characteristics
of each of the three stages of development are well
marked and serve to fix the limits of value within which
a given composition must fall. In Appendix A will be
found illustrations of samples of each type, chosen from
the Gary compositions on the basis of the judgments of
the Gary judges. Reference to this series of samples
(which in themselves constitute a composition scale)
will enable any reader to determine for himself exactly
what a composition of a given value is like.

Figure 40
Analysis of the Hillegas Composition Scale

[Figure: divisions of the scale; surviving labels include "Well organized, but commonplace in content" and "Exceptional content and quality."]

This analysis should be read as follows: If a composition because
of its errors in mechanics proves difficult to read and understand,
it falls in the first division of the scale (0-30). If the meaning is not
clear after repeated attempts to decipher the composition, the value
of it should be between 0 and 9 points, depending on the amount
that can be read. If the meaning is decipherable, but with difficulty,
it should be rated from 10 to 19 points, and so on.
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TRAINING  OF  SCORERS 

The five judges who scored the eighth grade samples
at Gary had had some experience in the use of the scale,
but only one (the writer) was convinced that scoring
by means of a scale yields constant and reliable results.
Accordingly, the first work of scoring was the training of
these judges in the use of the scale. The samples of the
Hillegas Scale were cut apart and given to the scorers, one
at a time, to be arranged in order of merit, as was
illustrated in the case of samples 1, 2, and 3, pages 241 and 242.
The differences between the first samples given out for
comparison were made very large in order that the
judgments of the scorers might agree. This established the
fact that the judges were able to recognize gross differences
in merit. Little by little, however, the amount of difference
from sample to sample was decreased until the limits of
discrimination of the judges were reached. A discussion
of the reasons for the relative positions assigned the
samples by each judge, and of the characteristics of the
samples themselves, then followed until some agreement
had been reached as to the basis on which judgment was
to be made. Finally, a certain amount of practice scoring
was done on samples whose values are known, using
those given in the Butte and other surveys, and in the
articles on measurement of English composition which
have been published from time to time.1 In this way
the basis of making judgments was soon standardized.

1 At this time the standard samples issued by Professor Thorndike
were not available.
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SCORING    OF   PAPERS 

When it had been determined that the judges could use
the scale consistently, the scoring of the Gary
compositions began. The eighth grade papers were scored by all
the judges, but two judges did most of the scoring of the
papers of other grades, each class being scored by a single
judge. At intervals, however, these judges re-scored
certain classes to make sure their standards were not
changing. Also, as early as possible a set of compositions
(mainly those given in V, Appendix A) was chosen from
the Gary papers to form the Gary Scale, and in cases of
doubt papers were referred to both the Gary Scale and
the Hillegas Scale. Most of the judges, however,
preferred to use the Hillegas Scale rather than the scale of
uniform material derived from it.

Each paper was assigned the mark which expressed
the judgment of the scorer as to its true value, as 37,
39, 43, etc. In this respect the practice at Gary differed
from that sometimes followed of giving each paper the
value of the scale sample it most resembled. Thus in
the Butte survey, compositions were marked at "0, 1,
2, etc., according to which one of the printed compositions
they thought it most like." A similar practice was
followed in the Salt Lake City survey.

RELIABILITY  OF  SCORING 

A study was made of the variations in judgments in
scoring the eighth grade samples.

Figure 41
Variations in Quality, Based on Table LX

ENGLISH COMPOSITION: VARIATION IN SCORING

The diagrams at the left hand side of the figure represent individual
papers. The numbers 11, 13, 7, 24, and 30 refer to the numbers of the
papers in Table LX. Letters A, B, C, and D, refer to the different judges.
Class indicates values for class as a whole. Each light line in the
diagram represents the value assigned the sample by one judge. Each heavy
line represents the value adopted as the true value of the sample. The
scale along the top of the figure represents quality on the Hillegas
Scale. The diagram shows that for the class as a whole the class scores
as determined from the scores from each judge, except A, agree closely
with the value as determined from the combined score of the five
independent judgments.

The diagrams marked 11 and 13 represent the scores of samples upon
which there was close agreement between the five judges. Paper 7
represents the scores for a sample in which Judge A alone showed a wide
variation. Papers 24 and 30 represent samples upon which there was
little agreement between the different judges.

The diagrams on the right hand side of the figure show the distribution
of the deviations from the score adopted as the true value of the samples.
The scale at the top shows the magnitude and quality of the deviations.
Figures written in the diagram just above the base line show for each
distribution the number of variations of a given type. The diagram for
Judge A is to be read as follows:

Out of 30 papers Judge A gave 6 scores which were from 10 to 14
points larger than the real value of the paper, 11 scores which were
from 5 to 9 points higher, 10 scores which were exactly the same or
within 4 points higher or lower, and 1 score each of value 5 to 9, 10 to 14,
or 15 to 19 points lower than the score adopted as the true value. Note
that Judge A has a tendency to score papers too high, while Judge B has a
tendency to score papers too low. Note also that in the distribution of
the class as a whole the distribution is symmetrical about the zero
point, and that out of 150 deviations 73 are less than 5 points.

For instance, in one eighth grade class (Table LX, page 248,
Figure 41, page 250), while individual judges differ widely on certain
samples (see numbers 3, 24, 29, and 30), yet the scores as
a whole show close agreement. Of the 150 individual
ratings given by the five judges on the thirty papers of
this class, 46, or 31 per cent., agree exactly with the median
score; 56 more, or 37 per cent., do not differ more than 5
points, or half a step of the scale; and only 10, or 7 per
cent., of the judgments differ more than one step from
the median values. Fifty-two deviations are positive,
and 52 negative; that is, variations are as likely to be
toward high scores as toward low, and in a large
number of scorings cancel each other.
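
The tallies of agreement reported in this section can be reproduced mechanically. The following Python sketch is offered only as an illustration: the function names and the three-paper sample are hypothetical and are not the Table LX data, and it assumes that the value adopted for a paper is the median of the judges' scores. The last line simply recomputes the percentages quoted in the text from the quoted counts.

```python
from statistics import median


def percent(count, total):
    """Whole-number percentage, as reported in the text."""
    return round(100 * count / total)


def agreement_summary(ratings):
    """Tally deviations of individual ratings from each paper's median.

    ratings: one inner list per paper, holding one score per judge.
    """
    deviations = []
    for paper_scores in ratings:
        paper_median = median(paper_scores)
        deviations.extend(score - paper_median for score in paper_scores)
    return {
        "judgments": len(deviations),
        "exact": sum(d == 0 for d in deviations),
        "within_5": sum(0 < abs(d) <= 5 for d in deviations),
        "over_10": sum(abs(d) > 10 for d in deviations),
        "positive": sum(d > 0 for d in deviations),
        "negative": sum(d < 0 for d in deviations),
    }


# Hypothetical ratings: three papers, five judges each (not Table LX data).
sample_ratings = [
    [40, 40, 42, 38, 40],
    [55, 50, 50, 47, 50],
    [30, 45, 33, 33, 20],
]
summary = agreement_summary(sample_ratings)

# Percentages quoted in the text, recomputed from the quoted counts.
quoted = [percent(46, 150), percent(56, 150), percent(10, 150)]
```

When positive and negative deviations are nearly equal in number, as in the class described, they tend to offset one another in any score built from many judgments.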

The effect of this neutralization of errors is seen in the
class scores. The actual median score of the class as a
whole, as determined by the scores of each of the five
judges, is practically the same as the class score as
determined from the combined scores of the judges (40).
For three of the judges, the values are identical (40).
The other two judges differ less than one step of the scale
(Judge A, 47.5, Judge E, 41.5). Even if the computed
medians are used, the differences in no case amount to
more than one step of the scale. That is, the score of
the class determined by the scores of a single judge is
not likely to vary by more than 10 points from the
true value.

The average deviation for all samples and judges is
4.5 points. For individual judges, the values range
from 5.8 points to 3.6 points. Therefore, the
uncertainty of the scoring by means of the scale amounts, on
the average, to about half a step.

This conclusion is borne out by the combined results
for the 127 eighth grade papers. The average deviation
of 635 judgments was 5.5 points (median 5.3). Thirty
per cent. of all the judgments agreed exactly with the
combined scores, while 18 per cent. additional fell within
5 points. Seventy-five per cent. of all the deviations
were less than 10 points.

The agreement of the median class scores as
determined from the scores of each judge with the class score
determined from the combined judgments is close. In
three cases only (all Judge A) are the differences larger
than 10 points. Excluding Judge A, the average difference
is 3.5 points. For Judge A the average is 10.5 points.
Therefore, the classes were scored consistently and the
results may be taken as correct within 10 points. The
chances are about even that a re-scoring of the papers by
a different set of trained judges would yield the same
results as those given in the tables within 5 points.

The different judges proved to have persistent biases.
Judge A, for instance, had a tendency to overestimate the
quality of a paper, while Judge B underestimated values.
These personal biases persist through long periods and
may be determined and allowed for. As the eighth grade
scores are the median of the five judgments, the bias of
Judge A toward high ratings does not affect the results
greatly. All other grades except the eighth were scored
by Judges C and D. The differences of the class scores of
the five eighth grade classes as determined by the
scorings of each of these judges from the same scores based
upon the combined judgments of the five judges were, for
Judge C, 0, s, 0, 3, 5, 7 (average 3 points); for Judge D,
7, 4, 0, 5, 1 (average 3.4 points). The scores from grade
to grade are, therefore, consistent and comparable
within the limit of errors noted.
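
A persistent bias of the kind described can be estimated as a judge's average signed departure from the combined score. The sketch below is a hedged illustration, not the procedure used in the survey: the function name and the three-paper data are hypothetical, and the combined score is assumed to be the median of the judges' scores. The final line recomputes the average of the five class differences quoted for Judge D.

```python
from statistics import mean, median


def judge_bias(scores_by_judge):
    """Mean signed deviation of each judge from the combined (median) score.

    scores_by_judge: dict mapping judge name to a list of scores,
    with papers in the same order for every judge.
    """
    judges = list(scores_by_judge)
    paper_count = len(scores_by_judge[judges[0]])
    combined = [
        median([scores_by_judge[j][i] for j in judges])
        for i in range(paper_count)
    ]
    return {
        j: mean([s - c for s, c in zip(scores_by_judge[j], combined)])
        for j in judges
    }


# Hypothetical scores for three papers (illustrative only).
scores = {
    "A": [50, 45, 60],
    "B": [40, 35, 50],
    "C": [45, 40, 55],
}
bias = judge_bias(scores)

# Average of the five class differences quoted in the text for Judge D.
judge_d_average = mean([7, 4, 0, 5, 1])
```

A positive value marks a judge who rates high and a negative value one who rates low; subtracting the estimated bias from that judge's scores is one simple way of allowing for it.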

METHOD   OF  TABULATION 

In the tabulation of results the composition scores were
grouped by tens. That is, the number of papers having
a score in the 30's was recorded as one frequency, all the
40's as another, and so on. In the Butte survey,
however, the frequencies based upon the Hillegas sample
values were recorded for scores 30, 40, 50, 60, etc. That
is, a group of scores extending from 31 to 42 and
centering at 36.9 was recorded as if extending from 30 to 40
centering at 35. The medians are, however, correctly
figured from the beginnings of the steps, and that the
method distorts the results is due wholly to the fact
that the values of the samples in the scale are 18, 26,
37, etc., instead of 15, 25, 35. In the Salt Lake City
survey the same practice was followed, and, in addition,
in computing medians 30 was taken as meaning values
from 25 to 35. The effect of this is to lower all medians
approximately 7 points from the actual value of the
median paper. For instance, class No. 14 Emerson (eighth
grade) made a median score of 47 Hillegas as determined
by the Gary procedure. According to the Butte
methods the score would have been 46.8, according to the
Salt Lake City methods 38. The actual median score
was 45. The Gary class scores may, therefore, be from 2
to 9 points higher than they would have been if a
different method of tabulation had been followed. The writer
believes, however, that the method used in this survey
reveals more precisely the true value of the class product.
The practice in other surveys is not known. In view of
the possibilities of such differences, as well as the
uncertainty in regard to standards adopted in scoring, little
reliance can be placed upon comparative data. The
reader should study the illustrative samples given and
interpret the Gary results in the light of these samples.
The use of a scale has at least served to standardize the
scoring of the Gary judges and to permit the presentation
of results in terms which are not ambiguous in meaning.
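
The effect of the interval convention on a median figured from grouped scores can be seen in a small computation. The Python sketch below is an illustration under stated assumptions: the function name and the frequencies are hypothetical, and the usual interpolation within the median group is assumed. It isolates only the shift caused by reading the label 30 as "25 to 35" instead of "30 to 39"; it does not attempt to reproduce the 46.8 or 38 quoted for class No. 14 Emerson, which also depend on how papers were matched to scale samples.

```python
def grouped_median(frequencies, first_lower_bound, width=10):
    """Interpolated median of scores tabulated in equal-width groups.

    frequencies[i] counts the papers in the group that starts at
    first_lower_bound + i * width.
    """
    half = sum(frequencies) / 2
    cumulative = 0
    for i, count in enumerate(frequencies):
        if count > 0 and cumulative + count >= half:
            lower = first_lower_bound + i * width
            return lower + (half - cumulative) * width / count
        cumulative += count
    raise ValueError("no scores to tabulate")


# Hypothetical frequencies for groups labeled 20, 30, 40, 50, 60.
frequencies = [2, 5, 10, 6, 2]

# Label 30 read as scores 30 to 39 (groups start at the label).
median_from_step_start = grouped_median(frequencies, first_lower_bound=20)

# Label 30 read as scores 25 to 35 (groups centered on the label).
median_centered = grouped_median(frequencies, first_lower_bound=15)

shift = median_from_step_start - median_centered
```

With the same frequencies, the second reading gives a median lower by exactly half the group width, which is why class scores tabulated under different conventions are not directly comparable.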

ERRORS  IN  PUNCTUATION  AND  GRAMMAR 

The great difficulty in scoring papers for errors in
punctuation and grammar is the lack of a standard basis
for determining errors. The rules of punctuation are
not absolute, and the constructions to be counted as
errors in syntax vary greatly with different examiners.
Further, comparative data are lacking, so that the
results have little significance. It is not enough to know
that a child makes a number of errors in grammar. We
must know also whether the number of errors is greater
or less than the errors made by the average child of the
same grade.

In the Willing Composition Scale used in the Denver
survey, not only is judgment based on general merit,
but the attempt is made to grade the samples on the
basis of the frequency of error as well; thus,
compositions of quality 20 have, on the average, 30 mistakes
in spelling, punctuation, and syntax per 100 words
written. By quality 50 the number of mistakes has fallen
to 14 and by quality 70 to 8. Unfortunately, however,
no statement is made as to the type of mistakes which
were counted as errors.

For the Gary work, after a careful study of
representative papers, it was decided to limit the scoring to gross
errors. For capitalization three mistakes only were
counted:

1. Failure to begin a sentence with a capital letter.

2. Failure to capitalize a proper noun.

3. The capitalization of common nouns.

In punctuation, also, only three errors were counted:

1. Failure to place a period at the end of a sentence.

2. Failure to place a question mark at the end of a
   question.

3. Failure to enclose a direct quotation in quotation
   marks.

In some cases it was found that where the period had
been omitted at the end of a sentence, the following
sentence was not commenced with a capital. This was
not counted as two errors, but recorded as the "period"
error only.

Eight types of errors in syntax were recorded:

1. The use of the wrong case form, as "me and him
   went."

2. The use of one word in place of another, as "it
   would of (have) been."

3. Lack of agreement of noun and pronoun, as "the
   pieces were about the size -- and it break."

4. Lack of agreement between subject and verb, as
   "they was."

5. Use of the wrong tense form, as "seen" for "saw."

6. Use of the double negative.

7. Confusion of dependent and independent clauses,
   as ". . . away, but worst thing was that
   there were not light on the streets and no road
   but a little path through the wood but I dressed
   up and took my dog and started off we were not
   far from home when my dog his name was
   Rover began to chase after I was a fright to go
   myself and began. . . ."

8. Omission of words essential to the thought, or the
   addition of irrelevant words, as "began to chase
   after (a rabbit, omitted) I was a fright to go
   myself."

A few gross errors were recorded which do not fall under
any of the headings given above, but their number was
so small (8 per cent. of the total) that it has not been
thought necessary to list them in detail.

Each paper was scored by two examiners independently.
There were marked disagreements between their

TABLE LXII

Coefficients of Correspondence Between Quality and a Number
of Specific Characteristics1

TEST                                  MEDIAN    MEDIAN       TOTAL
                                                DEVIATION    RANGE

Quality                               43.5      4            36-67
Total Length                          209       54           107-426
Different Words                       106       24.5         65-161
Vocabulary Index                                2.8          35-62
Spelling Errors (Coefficient)         22.5      12.5         0-88
Errors in Grammar (Coefficient)       22.5      152          0-129
Errors in Spelling and Grammar
  (Total per paper)                   10.5      6.5          <M3

Percentage of Total Cases Which Do Not Vary in Relative
Position More Than One Unit of Variability, Where
Quality of Composition Is Compared with

Total number of words                     33
Number of different words                 33
Vocabulary index                          33      16*
Coefficient of errors in spelling        -38     -30*
Coefficient of errors in grammar         -52     -44*
Total mistakes per paper                 -48     -36*

The coefficient of correspondence between errors in spelling and errors
in grammar was 48 per cent. (+ .60)*.

1 Based on papers of forty two eighth grade children present for all tests.
* Pearson coefficient of correlation.

This table is to be read as follows: If each child's position in the class
distribution for total length of composition be compared with his position
in the class distribution for quality of composition, when both positions
are expressed in terms of the median deviation of the class as a whole,
33 per cent. of the children will be found to have maintained the same
relative position within one unit of variability.
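
The reading rule just given can be expressed as a short computation. The Python sketch below is an assumption-laden illustration rather than the survey's own procedure: the function names and data are hypothetical, and "median deviation" is taken to mean the median of the absolute deviations from the class median. The source does not state the convention that yields negative coefficients for the error measures, so the sketch covers only the direct comparison described in the note.

```python
from statistics import median


def variability_units(values):
    """Express each value as a distance from the class median,
    measured in median deviations (assumed here to be the median
    of the absolute deviations from the median)."""
    center = median(values)
    unit = median([abs(v - center) for v in values])
    if unit == 0:
        raise ValueError("median deviation is zero")
    return [(v - center) / unit for v in values]


def correspondence(first, second, tolerance=1.0):
    """Percentage of children whose two relative positions differ
    by no more than `tolerance` units of variability."""
    pairs = zip(variability_units(first), variability_units(second))
    same = sum(abs(a - b) <= tolerance for a, b in pairs)
    return 100 * same / len(first)


# Hypothetical measures for five children (illustrative only).
quality = [1, 2, 3, 4, 5]
length = [3, 2, 1, 4, 5]

same_measure = correspondence(quality, quality)
mixed = correspondence(quality, length)
reversed_order = correspondence(quality, quality[::-1])
```

Comparing a measure with itself gives 100 per cent., while reversing the order of the class leaves only the child at the median in the same relative position.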
FIGURE 42

Degree of Correspondence Between Errors in Spelling and
Errors in Grammar

[Curves: actual misspellings; actual errors in grammar]



The  numbers  along  the  base  of  the  figure  represent  the  42  individuals  of 
an  eighth  grade  group.  The  scale  along  the  left  hand  axis  represents 
units  of  variability  (median  deviation  above  and  below  the  median  of 
the  class).  The  solid  line  represents  variability  ratios  for  errors  in 
spelling.    The  broken  line  represents  similar  ratios  for  errors  in  grammar. 

The curves show that approximately 50 per cent. of the children
maintain the same position in the two distributions within one unit of
variability. That is, individual No. 1 makes very few errors in either
spelling or grammar. Individual No. 3, however, while at the top of the
class in accuracy of spelling, is below the median for number of errors in
grammar. Individuals Nos. 12, 25, and 39 represent extreme deviations.
Individual No. 37 represents a variant of the opposite type. In other
words, he makes many errors in spelling, but is exactly at the median for
errors in grammar.

The reader should note that both distributions are badly skewed; the
range of variation above the median being a little over 1 and below the
median 8.8.

scores. The attempt was made to harmonize the various
judgments, but it proved so costly in time, and the final
decision seemed to rest upon such uncertain bases, that
in view of the lack of comparative data it was decided the
results were not worth the time and effort. Accordingly,
the averages of the errors recorded by the two
examiners were taken as the scores for each individual.1

FACTORS  DETERMINING  MERIT 

The tabulation of the different types of errors affords a
chance to investigate the relation between general merit
in composition and its various characteristics: spelling,
punctuation, and grammar. It would appear that
judgments as to quality of composition are affected more by
errors in spelling (coefficient of correspondence 38 per
cent.) and still more by errors in capitalization,
punctuation, and grammar (52 per cent.) than by the number of
different words used (33 per cent.) (Table LXII, page 260).
The coefficient of correspondence based upon total
mistakes is intermediate between those for spelling and
grammar (48 per cent.) (Figure 42, page 261). These
coefficients mean that in judging compositions one is
influenced now by one factor and now by another.

1 A  complete  record  of  the  scoring  for  Class  No.  45  Froebel  will  be  found 
in  Table  LXI,  page  256. 


VII.  READING 

§1. General Results

THE Gary schools recognize the importance of
reading by allotting to the subject annually 1,323 hours,
as compared with 1,280 hours in the conventional
school, or 26 per cent. of the time given to fundamentals,
as compared with 24 per cent. in conventional schools.

READING ABILITY

In current practice the direct teaching of the mechanics
of reading rarely extends beyond the third grade.
Reading in the higher grades passes over into training in
expression, in understanding, and in appreciation.
Consequently ability in reading comes to have many meanings,
each derived from the situation to which the term is
applied. Ability in reading may mean:

(1) Ability to recognize silently the general meaning
    of words of a given range of difficulty. (Otis.)

(2) Ability to "sound" correctly a given set of words.
    (Jones.)

(3) Ability to read aloud smoothly and with proper
    expression (without regard to whether the
    meaning is understood or not). (Gray.)

(4) Ability to read either silently or orally and to
    understand the essential relations existing between the
    essential elements of what is read. (Courtis.)

(5) Ability to read either silently or orally and tell
    in one's own words the substance of what has
    been read. (Starch, Brown, Gray.)

(6) Ability to read instructions either silently or
    orally and be able to act in accordance with the
    instructions. (Kelly.)

(7) Ability to read again and again (study) until one
    has mastered the contents of a passage so that
    one can answer questions about it or use the
    information in solving problems. (Thorndike.)

(8) Ability to read a passage and interpret the
    allusions which it contains.

(9) Ability to read a selection and be stirred
    emotionally by its aesthetic elements.

(10) Ability to read a passage and interpret the mood,
    ideas, or ideals of the author.

(11) Ability to read a selection and make judgments as to
    its style and merits as a piece of "good English."

And there are doubtless many other possible variations of
the senses in which "ability to read" may be understood.
Unfortunately, the makers of tests have not given
much attention to this phase of the subject. They have
labeled their productions "Tests of Reading," and have
been content to say explicitly1 or to imply that by reading
ability is meant the ability to complete this test
successfully. But when several such tests are given to the same
children, as at Gary, one would seem to have the right
to expect that the reading abilities of the children as
revealed by one test will show some agreement with the
reading abilities revealed by the next test, since both
are tests of reading. This, however, is precisely what
the results in general do not show.2

1 Teachers College Record, January 1916, p. 40: An Improved Scale for
Measuring Ability in Reading (Thorndike). "Call difficulty for
paragraph reading a characteristic of a paragraph and question about the
paragraph much of which produces a large percentage of wrong responses
to the questions, and little of which produces few, the individuals
concerned being the same.

"Call achievement in paragraph reading that thing much of which
enables an individual to respond correctly to a paragraph and a question
about the paragraph involving much difficulty for paragraph reading
whereas an individual of less achievement could respond correctly only
to a paragraph and a question about the paragraph of less difficulty."

2 See Richards and Davidson, School and Society, Vol. IV, September
2, 1916.

The tests given at Gary simply reveal the character of
the response made by the children to a number of specific
situations in which ability in reading enters as one
element. To aid the reader in appraising the value of the
results and in deciding to what extent each test measures
mainly reading or mainly some other abilities, some
description of the tests used is necessary.

MEASUREMENT OF ORAL READING

In oral reading, the measurable elements are the rate
and accuracy of reading, and the quality of expression.
Gray's Oral Reading Scale was used to measure the first

The scale consists of a number of paragraphs each a
little more difficult in [illegible], vocabulary, and structure
than the one before it. It is essentially a difficulty test;
that is, each child begins with simple material well
below his range of ability and progresses through the
scale until he reaches material which is so difficult that
he fails. The time required to read each paragraph and
the mistakes made in reading them are noted. It is
thus possible to [illegible] the results of timing in objective


The median ability of eighth grade Gary children in
oral reading when thus measured may be inferred from
Figure 43, page 267. Half of the eighth grade children
are able to read satisfactorily the sample paragraph
there shown. (Gray's Standard 4.)1

The range of ability in the eighth grade may be
comprehended from the samples shown in Figure 44.
Approximately 10 per cent. of the children are able to read
paragraph B under the standard conditions, while about
10 per cent. are not able to read paragraph C under the
standard conditions. That is, the abilities of about 80
per cent. of the eighth grade children fall between these
limits.

1 A paragraph is accepted under Gray's Standard 4 if read without
more than one mistake, or if read in less than 20 seconds without more
than two mistakes.

The abilities of the Gary children expressed in terms of
the Gray Scale are as follows: The second grade child
of median ability is just not able to read paragraph 1
under Gray's Standard 4, the median third grade child
can read paragraph 2, but not paragraph 3, while the
median eighth grade child can read paragraph 8, but not
paragraph 9.

If  successful  reading  is  taken  as  reading  a  paragraph 
in  less  than  40  seconds  with  not  more  than  six  errors 
(Gray's  Standard  1),  the  median  second  grade  child 
can  read  paragraph  3  but  not  4,  while  the  median  eighth 
grade  child  can  read  paragraph  11  but  not  paragraph  12. 

Figure  43 
Difficulty  of  Material  Represented  by  Eighth  Grade  Score 

SAMPLE  A 

8 

The  crown  and  glory  of  a  useful  life  is 
character.  It  is  the  noblest  possession  of 
man.  It  forms  a  rank  in  itself,  an  estate  in 
the  general  good  will,  dignifying  every  sta- 
tion and  exalting  every  position  in  society. 
It  exercises  a  greater  power  than  wealth, 
and  is  a  valuable  means  of  securing  honor. 

Half of the eighth grade children are able to read orally the paragraph
above in from 20 to 40 seconds without making more than one mistake,
or to read it in less than 20 seconds without making more than two
mistakes. (Gray's Standard 4.) The sample is paragraph 8 on Gray's
Scale.

Figure 44
Variations in Eighth Grade Ability

SAMPLE  B 

10 

Responding  to  the  impulse  of  habit 
Josephus  spoke  as  of  old.  The  others  lis- 
tened attentively  but  in  grim  and  contemp- 
tuous silence.  He  spoke  at  length,  continu- 
ously, persistently,  and  ingratiatingly.  Fin- 
ally exhausted  through  loss  of  strength  he 
hesitated.  As  always  happens  in  such  exi- 
gencies he  was  lost. 

SAMPLE C

3

Once there were a cat and a mouse.
They lived in the same house. The
cat bit off the mouse's tail. "Pray,
puss," said the mouse, "give me my
long tail again."

"No," said the cat, "I will not give
you your tail till you bring me some
milk."

Ten per cent. of the eighth grade children are able to read Sample B
satisfactorily (under Gray's Standard 4),1 and 10 per cent. are unable to
read satisfactorily Sample C. That is, the ability of 80 per cent. of
the eighth grade children ranges between ability to read paragraph B
and paragraph C.

Sample B is paragraph No. 10 on Gray's Scale. Sample C is
paragraph No. 3 on Gray's Scale. Sample C represents the median
development of children in the upper half of the third grade at Gary.

(Table  LXIII,  page  270.)  That  is,  the  development 
of  ability  in  oral  reading  at  Gary  is  quite  uniform  from 
grade  to  grade.     (Figure  45,  page  271). 

COMPARATIVE  DATA2 

For many who are not directly engaged in teaching,
the tables and illustrations given on pages 267 and 268
will have little meaning. Gray's Scale, however, affords a
score in points based upon the difficulty of the
paragraphs, the time taken to read them, and the number of
errors made. In points the Gary second grade score was
27 (Cleveland 42, Grand Rapids 44, St. Louis 47,
average of 23 Illinois cities 20) and the Gary eighth grade
score 41 (Cleveland 48, Grand Rapids 48, St. Louis 51).
(Table LXIV, page 272.) The development of ability in
oral reading at Gary closely parallels the development of
other cities, but at a level which is about a year below
the average of other cities. (Figure 46, page 273.)

1 To meet Gray's Standard 4, a paragraph must be read without making
more than one mistake, or read in less than 20 seconds without making
more than two mistakes.

2 See page 38.

The differences in the performances of the Gary
children as compared with those recorded for the average
children in conventional schools are brought out by an
analysis of the results. Thus, to read paragraph 4 the
Gary eighth grade children required 18.7 seconds
(Cleveland 16.62, St. Louis 17.85) and made 2.1 mistakes
(Cleveland 1.24, St. Louis .73). That is, the Gary
children read more slowly and make more mistakes than the
children in Cleveland and St. Louis.

Silent  reading  was  tested  at  Gary  by  a  Reading  and 
Reproduction  Test,  by  the  Kansas  Silent  Reading  Test, 
and  by  the  Trabue  Language  Scale,  but  it  should  be 
recognized  at  the  outset  that,  because  of  the  difficulty 


TABLE LXIII

Median Paragraphs of Gray's Scale Read by the Different Grades

GRADE    STANDARD 1    STANDARD 4    DIFFERENCE

  2          3.3            .9           2.4
  3          5.4           2.1           3.3
  4          6.8           3.9           2.9
  5          7.6           4.4           3.2
  6          9.0           5.4           3.6
  7         10.7           6.8           3.9
  8         11.3           8.1           3.2

Standard 1. A paragraph is accepted under Gray's Standard 1 if read
in forty or more seconds without more than four mistakes,
or if read in less than forty seconds without more than six
mistakes.

Standard 4. A paragraph is accepted under Gray's Standard 4 if read
without more than one mistake, or if read in less than
twenty seconds without more than two mistakes.
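
The two acceptance rules defined in the notes above lend themselves to a short sketch. The Python below is illustrative only: the function names and the sample record are hypothetical, and the stopping rule (a child's score is the number of consecutive paragraphs accepted before the first failure) is an assumption, since the text does not spell out how the median paragraph was figured.

```python
def meets_standard_1(seconds, mistakes):
    """Standard 1: at most 4 mistakes, or at most 6 if read in under 40 s."""
    return mistakes <= 4 or (seconds < 40 and mistakes <= 6)


def meets_standard_4(seconds, mistakes):
    """Standard 4: at most 1 mistake, or at most 2 if read in under 20 s."""
    return mistakes <= 1 or (seconds < 20 and mistakes <= 2)


def last_paragraph_read(readings, standard):
    """Number of consecutive paragraphs, from paragraph 1, that pass.

    readings: list of (seconds, mistakes), one per paragraph in scale order.
    """
    passed = 0
    for seconds, mistakes in readings:
        if not standard(seconds, mistakes):
            break
        passed += 1
    return passed


# Hypothetical record for one child on paragraphs 1 through 5.
child = [(15, 0), (18, 2), (25, 2), (35, 5), (45, 6)]
by_standard_1 = last_paragraph_read(child, meets_standard_1)
by_standard_4 = last_paragraph_read(child, meets_standard_4)
```

Because Standard 4 is the stricter rule, the same record carries a child less far up the scale under it than under Standard 1, which is the pattern the difference column of the table reflects.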

Figure 45

Development in Oral Reading Ability: Gray's Oral Reading Scale

The  scale  along  the  base  of  the  figure  represents  grades.  The  scale 
along  the  vertical  axis  represents  the  paragraphs  of  Gray's  Scale.  The 
solid  line  shows  the  median  ability  of  each  class  under  Standard  1  in  terms 
of Gray's Scale. The broken line shows the median ability of each
class in terms of Standard 4.

The  two  curves  show  there  is  a  steady  and  quite  uniform  development 
throughout  the  grades. 

of the problem and the lack of adequate tests, the
measurements of silent reading at Gary are less conclusive
than are the measurements previously discussed.

Silent  reading  is  carried  on  primarily  for  the  reader's 
benefit. Its most important aspect is the degree of
comprehension of meaning, its second, the rate of reading.
Unfortunately,  however,  silent  reading,  pure  and  simple, 
is  limited  to  the  perception  and  comprehension  of  the 

TABLE LXIV

Ability in Oral Reading: Gray's Oral Reading Scale

Scores by points according to Gray's methods. From each class at
Gary at least ten children were measured (thirty children or more from
classes in grades three, five and seven). Selections were made on basis
of teacher's judgment, the three best readers, four average readers, and
the three worst readers being chosen.

GRADE                      2     3     4     5     6     7     8

Number of children       102   346   126   297   134   219    52
Gary, actual average      27    36    39    39    41    42    41

23 Illinois Cities1       20    27    40    44    45    47   [?]
Cleveland2                42    46    47    48    49    47    48
Grand Rapids3             44    47    49    50    48    48    48
St. Louis4                47    50    52    51    51    51    51

1Studies of Elementary School Reading through Standardized Tests, Gray. Page 13a.
2Studies of Elementary School Reading through Standardized Tests, Gray. Page isi.
3Grand Rapids Survey. Page 66.
4Survey of St. Louis Public Schools, Vol. II, p. 1*6.

This  table  is  to  be  read  as  follows:  In  Gary  102  second  grade  children, 
selected  as  representative  children  from  the  various  classes  tested, 
made  an  average  score  of  27  points  when  measured  with  Gray's  Scale 
according  to  Gray's  directions.  Twenty  three  cities  in  Illinois  made 
an  average  score  of  20  in  the  second  grade,  Cleveland — 42,  Grand 
Rapids — 44,  and  St.  Louis — 47. 

matter read, while any test of comprehension is necessarily
based upon the response an individual makes to
the test situation. This brings into play new factors.
Accurate measurement of ability in silent reading, therefore,
is exceedingly difficult, and at the time of the Gary
survey there were no wholly satisfactory tests of silent
reading. What is probably the best silent reading test
of all, Thorndike's Scale Alpha 2, it was not practical

Figure 46

City Wide Average Scores by Grades: Gray's Oral Reading Scale

The scale along the base of the figure represents grades. The scale
along the vertical axis represents scores in terms of Gray's Scale. For
methods of computing these scores and of drawing the graph, see §2.
The solid line represents the Gary results; the dotted line, the scores
made by the Grand Rapids pupils. The average difference between the
Gary and Grand Rapids results is 7.6 points, while the average annual
growth is 5 points. The Gary scores are thus approximately a year and
a half lower than those from Grand Rapids.
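
The "year and a half" in the caption follows from dividing the average difference in score by the average annual growth. A minimal sketch of that arithmetic in Python is given below; the function name `lag_in_years` is hypothetical, and the two figures are those quoted in the caption.

```python
def lag_in_years(score_gap: float, annual_growth: float) -> float:
    """Express a difference in score as years of growth at the stated rate."""
    if annual_growth <= 0:
        raise ValueError("annual growth must be positive")
    return score_gap / annual_growth


# 7.6 points of difference, 5 points of growth per year (from the caption).
print(round(lag_in_years(7.6, 5.0), 2))  # 1.52
```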

to use owing to the conditions under which the survey was
conducted. However, several of the conventional reading
tests were given, and while the results do not tell
the whole story, they tell enough to show some of the
important characteristics of the product.

REPRODUCTION TESTS

The  simplest  test  of  silent  reading  would  seem  to  be 
the  measurement  of  the  rate  at  which  a  story  is  read, 
and  the  quality  of  a  reproduction  of  the  story.  This 
method  has  been  followed  by  Gray,  Starch,  Brown,  and 
other  investigators.  However,  a  little  reflection  will 
show  that  reproduction  is  determined  more  largely  by 
(1) memory and (2) ability in English composition than
by  ability  to  read  and  understand.1  Hence,  reading  and 
reproduction  tests  were  given  at  Gary  to  determine 
(1)  the  median  rate  of  reading  for  the  different  grades, 
and  (2)  the  stability  of  the  reading  habits  of  the  Gary 
children  as  measured  by  individual  fluctuations  in  rate 
in  successive  tests.2 

The  test  materials  were  interesting  stories  taken  from 
children's  magazines.  A  very  simple  story  was  used  in 
grades  two  to  five,  a  little  more  complex  story  in  grades 
five  to  eight,  and  a  portion  of  an  adult  biography  in 
grades eight to twelve. Finally a fourth story (child's)
was  given  to  grades  four  to  twelve  under  uniform 
conditions.  Thus  all  grades  from  four  to  twelve  were 
measured  at  least  twice,  and  some  three  times.    From 


1For a critical discussion of what reproduction tests measure and their
relation to silent reading, see §2 of this chapter, page 314, and IX of
Appendix A, page 443.

2The tests were also scored for quality of reproduction according to
the conventional plan, but as the writer does not regard the results as
significant except as a confirmation of the conclusions of the chapter
on English composition, they have been put in IX of Appendix A.

the repeated measurements, tabulations were made,
both of the rate of reading and of the amount of
individual variation in successive tests.

The median rates at which the eighth grade children
read the children's stories were 201 and 207 words per
minute respectively. The median rate for the more
difficult  story  was  170  words  per  minute.  A  rate  of 
204  (the  average  of  201  and  207)  words  per  minute  was 
chosen  as  best  representing  the  ordinary  reading  rate  of 
eighth  grade  children  on  material  suitable  for  their 
grade.  The  corresponding  eighth  grade  rate  of  oral 
reading  was  200  words  per  minute  (based  on  paragraphs 
2  and  3  of  Gray's  Oral  Reading  Scale).  That  is,  the 
rates  of  oral  and  silent  reading  were  nearly  the  same. 
(Table  LXV,  page  276.) 

A  comparison  of  the  rates  of  oral  and  silent  reading 
by  grades  shows  that  the  rate  of  oral  reading  is  at  first 
greater  than  that  for  silent  reading  (second  grade  rate 
for oral reading 78 words per minute, rate for silent
reading 54) but from the sixth grade on it is the rate of silent
reading  that  is  the  greater  (sixth  grade  oral  183,  silent 
185).  The  development  of  ability  in  silent  reading  is 
rapid  in  the  lower  grades  (difference  between  oral  and 
silent  reading  for  grade  two  was  20  words  per  minute; 
grade  six,  2  words  per  minute)  but  from  the  sixth  grade 
on  the  two  rates  differ  but  little.    (Figure  47,  page  278.) 

The curve for the development of rate of silent reading
is of the conventional type and shows the usual well
marked tendency to reach a maximum at the eighth
grade (seventh grade 198 words, eighth grade 204
words). The change of this maximum to a higher level
during high school years (ninth grade 235) is readily
explained on the basis of the selective action of
promotion to high school grades. Therefore, there is at Gary
continuous and well marked development of the
conventional type of ability to read silently as measured by
rate of reading.

In Table LXV, page 276, the median individual variation
in rate of reading in two trials of tests composed of
material of equal difficulty is given as 33 words per minute.
This ranges from 39 words per minute, when the rate
of reading is from 10 to 100 words per minute,1 to 25
words per minute, when the rate of reading varies from
300 to 380 words per minute.2 In other words, the
variability of a child's performance decreases as the rate
of reading increases. Children who read but 60 or 70
words per minute in one test will, in half the cases, make
a score of 105 or more words per minute in a second
trial with material of equal difficulty, or vice versa,
but children who read at the rate of 350 words per minute
in one test will, in half the cases, not make
scores of more than 375 words per minute in a second
trial. That is, for slow, immature readers the variation
at Gary is 67 per cent, while for rapid readers it is but
7 per cent of their rate. The median rate of variation
for 118 eighth grade children was 20 per cent. This

1Median of 103 cases, 58 words per minute.
2Median of 20 cases, 365 words per minute.

Figure 47

Comparative Rates of Oral and Silent Reading

The scale along the base of the figure represents grades. The scale
along the vertical axis represents number of words read per minute.
Oral reading is based upon the rate of reading paragraphs one and two of
Gray's Scale; silent reading, upon a story of approximately equal
difficulty. The solid line, silent reading; the dotted line, oral reading.

The curve for the development of rate of silent reading is of the
conventional type and shows a tendency to approach a maximum at the eighth
grade. The portion of the curve for high school years reaches a higher
level. This is probably due to the selective effect of promotion to high
school upon ability. The rate of oral reading in the lower grades is
slightly higher than the rate for silent reading, but in the upper grades
this condition is reversed. This probably means that the emphasis of
instruction in the Gary schools on oral reading operates to prevent proper
development of ability in silent reading.

indicates that there are many eighth grade children whose
reading abilities are unstable. Unfortunately, no
comparative data are available on this point so that it
is impossible to tell whether conditions at Gary are
better or worse than elsewhere. This degree of
variability probably represents an undesirable characteristic
of the product of instruction in reading.
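
The percentages quoted for slow and rapid readers appear to follow from dividing the median variation of each group by the median rate given for that group in the footnotes (39 on 58 words per minute, and 25 on 365 words per minute). The sketch below reproduces that arithmetic under this assumption; the function name `relative_variation` is hypothetical.

```python
def relative_variation(variation_wpm: float, median_rate_wpm: float) -> float:
    """Variation between two trials as a per cent of the median reading rate."""
    if median_rate_wpm <= 0:
        raise ValueError("median rate must be positive")
    return 100.0 * variation_wpm / median_rate_wpm


slow_readers = relative_variation(39, 58)    # group reading 10 to 100 words
rapid_readers = relative_variation(25, 365)  # group reading 300 to 380 words
print(round(slow_readers), round(rapid_readers))  # 67 7
```

The absolute variation falls only from 39 to 25 words per minute, but relative to the rate of reading it falls nearly tenfold, which is the sense in which the rapid readers are the more stable.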

COMPARATIVE  DATA1 

At Gary the eighth grade rate of reading orally
paragraph 1 on Gray's Scale was 204 words per minute,
while  in  Cleveland  it  was  214  words,  and  in  St.  Louis 
179  words.  For  paragraph  4,  a  more  difficult  paragraph, 
the  rate  at  Gary  was  196  words,  at  Cleveland  220,  and 
at  St.  Louis  205.    (Table  LXVI,  page  280.) 

Similarly,  the  eighth  grade  rate  for  silent  reading  was, 
at  Gary,  204  words  per  minute,  while  the  rate  recorded 
by  Starch  is  240  words,  by  Gray  234  words,  by  Brown 
290  words,  by  Courtis  280  words,  and  by  the  Salt  Lake 
City  survey  209  words.     (Table  LXVII,  page  282.) 

KANSAS  SILENT  READING  TESTS 

The  Kansas  Silent  Reading  Tests  consist  of  a  number 
of  short  paragraphs  each  of  which  directs  the  child  to 
make  some  response,  and  the  accuracy  with  which  this 
is  done  indicates  whether  or  not  he  has  read  the  directions 
understandingly.    As  illustrations  of  their  character  the 

1See  page  38. 

first two paragraphs of the test for the third, fourth, and
fifth grades (Test I), two paragraphs of a different sort
(numbers six and eight) from the test for the sixth,
seventh, and eighth grades (Test II), and two
paragraphs of still a different kind from the test for the ninth,
tenth, eleventh, and twelfth grades (Test III) are shown
in Figure 48, page 283. From these it will be seen that
the tests cover a very wide range of reading material and
very different reading situations, yet in every case the
response is simple and is easily judged to be right or
wrong.

It will be apparent upon inspection that while reading
enters as one element of the activities called for by the
test, the other activities of observing, analyzing,
judging, reasoning, etc., form such a large part of the total
activity that the tests may legitimately be considered
to be measures of general intelligence rather than
measures of mechanical skill in reading. In the judgment
of the author of this report, the tests afford a valuable
index of the degree of development attained in the
ability "to read and think about what is read."

At Gary Test I was given to all grades from the third
to the twelfth, but the time was reduced to three minutes
in all grades above the eighth, and the scores that
would have been made in five minutes, the standard
time allowance, were computed by multiplying the
actual scores by 5/3. Test II1 was given in grades four

1Scores in Test II have proved to be slightly lower than scores in Tests
I and III.

TABLE LXVII

Comparative Data for Rates of Silent Reading

                           SALT        GRAY4                    COURTIS6
                           LAKE                                 NORMAL    CAREFUL
GRADE   GARY1   STARCH2    CITY3   ACTUAL  ADJUSTED   BROWN5    READING   READING

  2       54      108                90       90
  3      109      126               138      138       199
  4      140      144               132      180       213       161       106
  5      166      168      189      152      204       269       180       133
  6      185      192      212      167      216       272       226       172
  7      198      216      219      161      228       279       256       178
  8      204      240      209      172      234       290       262       200
  9      235
 10      249
 11      262
 12      270

1On the basis of uniform material.

2Educational Measurements, p. 3a. Material increases in difficulty from lower to
higher grades.

3Report of Salt Lake City Survey, p. 160. Uniform material.

4Scale from diagram 17. Studies of Elementary School Reading through Standardized
Tests, Gray. Adjusted rates are on basis of uniform material.

5Bulletin No. 1, Bureau of Research, Department of Public Instruction, New
Hampshire, p. 57. Variable material.

6Fourteenth Yearbook of the National Society for the Study of Education, p. 5a.
Uniform material.

Rates in different cities are based on different materials, hence the
results are of value only for general comparisons.

This table is to be read as follows: In Gary, the fifth grade children
read silently a simple story at the rate of 166 words per minute.
According to Starch the average rate at which fifth grade children should
read stories silently is 168 words per minute. In Salt Lake City the
average rate for the fifth grade was 189 words per minute. According
to Gray the rate is 152 words per minute on difficult material or 204
words per minute if adjustment is made for difficulty. (Based on scores
of 13 cities of Iowa, Minnesota, Tennessee, and Illinois.) The method of
computing rate of silent reading is considered later. According to Brown,
the best average made by any fifth grade class so far measured by him
is 269 words per minute. According to Courtis, the average rate of
normal reading is 180 words per minute; of careful reading 133 words per
minute.

Figure 48

Sample Paragraphs from the Kansas Silent Reading Tests

PARAGRAPH A FROM TEST I (Value 1.2)

I have red, green and yellow papers in my hand. If I place the
red and green papers on the chair, which color do I still have
in my hand?

PARAGRAPH B FROM TEST I (Value 1.2)

Think of the thickness of the peelings of apples and oranges.
Put a line around the name of the fruit having the thinner
peeling.

Apples    Oranges

PARAGRAPH C FROM TEST II (Value 2.3)

In going to school, James has to pass John's house, but does
not pass Frank's. If Harry goes to school with James whose
house will Harry pass, John's or Frank's?

PARAGRAPH D FROM TEST II (Value 2.6)

Here are two squares. Draw a line from the upper left-hand
corner of the small square to the lower right-hand corner of
the large square.

[Two squares, one small and one large.]

PARAGRAPH E FROM TEST III (Value 4.3)

Bone is composed of animal matter and mineral matter. The
former gives it toughness and the latter rigidity. Yesterday I
placed a bone from a chicken's leg in a bottle of acid, and
found this morning that I could wrap the bone around my finger
like gristle. Which kind of matter was removed from the bone?

PARAGRAPH F FROM TEST III (Value 4.8)

There are three horizontal lines; the first is three inches in
length, the second two inches, the third one inch. We know that
if the second and third lines are joined end to end the
resulting line will be as long as the first line. Suppose that
the first and second lines are joined end to end. How many
times as long as the third line will the resulting line be?

Paragraphs A and B from Test I (for grades three, four, and five).
Paragraphs C and D from Test II (for grades five, six, seven, and eight).
Paragraphs E and F from Test III (for high school grades).

These paragraphs were selected as representative of the different types
of activities called for in the test. It should be noted that while the
reading activity is one element determining a correct response, it is but
one. The Kansas Tests probably measure the ability to read and
understand, and to think or reason about what is read.

to nine and Test III from grades eight to twelve. The
conventional scores were found according to the standard
instructions, and for comparative purposes only those
results were taken which were derived from tests given
to the grades and under the conditions provided in the
standard directions. For the discussion of the
conditions as they appear at Gary, however, the results from
Test I (for all three tests in the eighth grade) were
tabulated for rate of work (total number of points attempted)
and for accuracy (ratio of points right to points
attempted, expressed as rate per cent).1
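
The two measures tabulated for Gary, and the adjustment of a three-minute score to the standard five-minute allowance, amount to simple ratios. The Python sketch below is an illustration only; the function names are hypothetical, the single child's figures are invented for the example, and the ninth grade figure of 20.4 points is the one reported in Table LXVIII.

```python
STANDARD_MINUTES = 5  # standard time allowance for the Kansas tests


def scale_to_standard(score: float, minutes_allowed: float) -> float:
    """Estimate the five-minute score from a score made in a shorter time."""
    return score * STANDARD_MINUTES / minutes_allowed


def accuracy_per_cent(points_right: float, points_attempted: float) -> float:
    """Ratio of points right to points attempted, expressed as rate per cent."""
    if points_attempted <= 0:
        raise ValueError("no points were attempted")
    return 100.0 * points_right / points_attempted


# Ninth grade, Test I: 20.4 points attempted in three minutes.
print(round(scale_to_standard(20.4, 3), 1))  # 34.0

# A hypothetical child who attempts 12 points and gets 10 of them right.
print(round(accuracy_per_cent(10.0, 12.0), 1))  # 83.3
```

The scaling assumes that a child would have kept the same pace for the remaining two minutes, which is one reason the text treats the high school figures with some caution.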

The eighth grade at Gary attempted 23.9 points in
Test I with an accuracy of 83 per cent. (Table LXVIII,
page 287.) Both the rate and the accuracy of work
increase quite regularly throughout the grades (third grade
12.3 points attempted, 28 per cent accuracy) showing
that the development of ability in rate is of the
conventional type (Figure 49, page 288). The ninth grade scores
indicate a very great increase over those of the eighth
grade (23.9 to 34.0 points attempted and 83 to 88 per
cent accuracy), but this may be due to difference in
conditions under which the tests were given.

The  reader  should  note  that  the  level  of  accuracy 

1The  reader  should  note  that  this  method  of  tabulation  is  a  departure 
from  conventional  methods.    See  also  page  186. 

reached by the eighth grade is 83 per cent. That is,
half the children in the eighth grade are able to read the
simple paragraphs designed for the measurement of the
ability of the third grade children only well enough to
give correct answers to a little more than eight out of
ten questions. This will furnish the reader with a basis
of estimating directly the degree of ability in reading of
the Gary eighth grade children, and of interpreting the
results given in the table.1

In terms of the conventional score for the Kansas
Test, the ability of the eighth grade is about 21 points
(Test I, 21.3 points; Test II, 18.7 points; Test III,
22.1 points). (Figure 50, page 289.)

COMPARATIVE  DATA2 

The scores of the Gary eighth grade classes are
practically the same as the corresponding scores made by
classes in other cities (Gary 18.7, Kansas 20.1, Iowa 20.6,
Detroit 19.0, New Orleans 19.1, combined tabulations
from all parts of the country 19.2). The scores of the
third grade at Gary are very much below those made by
other cities (Gary 2.5, median of country 5.3). (Table
LXIX, page 290.) That is, the abilities measured by
the Kansas Reading Tests begin to develop later at Gary

1The reader should note that this statement stands by itself and has no
significance from the comparative point of view. In the absence of
comparative data, it is impossible to say whether this performance is
better or worse than that of children in conventional schools.

2See page 38.

TABLE LXVIII

City Wide Median Scores: Kansas Reading Tests

           GARY, TEST I                    GARY SCORES
GRADE    RATE           ACCURACY†    TEST I†          TEST II    TEST III

  3      12.3             28.2        2.5               -          -
  4      14.9             60.1        6.7              5.6         -
  5      18.1             59.3        9.8              9.9         -
  6      20.6             68.5       13.8             11.9         -
  7      23.8             75.2       18.3             15.3         -
  8      23.9             83.        21.3             18.7        22.1
  9      20.4  34.0*      87.8       14.2  23.7*      22.2        24.5
 10      20.1  33.5*      92.7       16.7  27.8*       -          29.2
 11      20.2  33.7*      97.7       17.7  29.5*       -          33.5
 12      18.5  30.8*      95.7       16.8  28.0*       -          29.6

*Smaller values, actual scores made in three minutes; larger values, scores
computed on the basis of five minutes, the standard time allowance.

†Disagreement due to difference in method of tabulation.

This table is to be read as follows: The third grade children in the
Kansas Silent Reading Tests attempted 12.3 points in the time allowed,
of which 28.2 per cent were correct. The score of this grade computed
in the manner provided by Kelly is 2.5 points. The score of the ninth
grade in Test I was 20.4 points attempted in three minutes, from which
the amount that would have been done in five minutes, the time allowed
the lower grades, was computed to be 34.0 points. The accuracy of
the ninth grade was 87.8 per cent. The score of the ninth grade as
determined in the manner provided in the instructions was 14.2 points
right in three minutes, or 23.7 points right in five minutes. In Test
II it was 22.2 points right in five minutes, and in Test III it was 24.5
points right in five minutes. The three scores 23.7, 22.2, and 24.5
show how correctly the three tests have been weighted to give scores of
equal value.

Figure 49

Development in Rate and Accuracy: Kansas Silent Reading Tests

The scale along the base of the figure shows the number of points tried.
The scale along the vertical axis shows the per cent that the points
right are of the points tried. The position of the various grade median
scores is shown by the figures on the curve.

Ability to read the simple material in the test devised for the third
grade develops quite regularly through the seventh grade. The eighth
grade has a score which is greater in accuracy only. The high school
grades are very much higher in rate and somewhat higher in accuracy.
The break between the eighth and ninth grades is probably due in part to
the selective action of promotion to high school upon ability and in
part to the fact that in these grades it was necessary to stop the tests
at the end of three minutes and compute the score that would have been
made in five minutes. The curve for the development of reading ability
in Gary is of quite the conventional type.

Figure 50

Development in Accuracy: Kansas Silent Reading Tests
(Conventional Scores)

The scale along the base of the figure represents grades. The vertical
scale represents the score in Kansas Silent Reading Tests as determined
from the right answers according to the standard instructions. The solid
line represents Gary; the dotted line, scores made by about 9,000
children in 19 cities in Kansas.

The Gary results are lower than the Kansas results in grades three, four,
and five, and equal to the Kansas results in grades six, seven, eight, and
nine, and are very much above the Kansas results in other high school
grades.


than in the conventional schools, but by the seventh and
eighth grades the handicap of the late start has been
overcome. The scores in high school years at Gary are equal
to or much greater than the corresponding scores in other
cities, although the data for this statement are less
complete than for the comparisons in the other grades
(twelfth grade 29.6, median of country 29.7). As
measured by the Kansas Reading Tests, the eighth grade
product of the instruction in reading in the grades at
Gary is equal to that of the conventional school.

TRABUE  LANGUAGE  SCALES 

The  Trabue  Language  Scales  consist  of  ten  sentences 
differing  in  complexity,  in  which  one  or  more  words  are 
missing.  Children  are  asked  to  supply  the  missing 
words.  Three  representative  sentences  are  shown  in 
Figure  51,  page  292.  From  these  it  may  be  seen  that 
the  Trabue  Test  measures  a  complex  of  many  abilities,  of 
which  reading  is  but  one. 

The scales are issued in the form of four tests, known
respectively as B, C, D, and E. A child is given as
much time as he needs and his score does not show the
amount of work done in this time, but the difficulty of
the most difficult sentence that he is able to complete.
Thus, except for certain minor irregularities in scoring
imperfect answers, a child who is able to complete correctly
sentence 2 in Figure 51 would be given a score of 10, while
a child who was able to complete correctly sentence 3 in
Figure 51 would be given a score of 20. Sentence 3 is twice
as hard as sentence 2 in terms of the units adopted for
measurement of the difficulty of these sentences. The
score of a class, therefore, is to be interpreted in terms

16.0 at the twelfth grade, the value which best represents
eighth grade ability is probably 13.6 (Table LXX, page
294). The development in the lower grades is rapid (from
6.5 at the third grade to 8.5, or 2.0 points increase at the
fourth grade) and gradually decreases as the upper grades
are reached (eighth to ninth grade difference, .7; eleventh
to twelfth grade difference, .5). The development is,
therefore, of the conventional type (Figure 52, page 295).

COMPARATIVE   DATA1 

A comparison of the Gary scores with the standards
published by Trabue shows almost perfect correspondence
at all grades (third grade difference, Gary, +.5; eighth
grade difference, +.3; twelfth grade difference, -.2).
The differences are sometimes positive and sometimes
negative and are insignificant in amount. Comparisons
with such other scores as are available show the Gary
scores slightly lower (eighth grade comparisons, Gary
minus Detroit, -2.4; Gary minus Chatham, -2.1; Gary minus
Nassau County, -.4). (Table LXXI, page 296.) On this basis
the Gary schools would be judged slightly2 below
conventional schools in the products measured by these scales.

SCHOOL  TO   SCHOOL  COMPARISONS 

Rather  marked  differences  in  reading  ability  from 
school  to  school  are  evident.  Thus,  in  oral  reading  the 
Emerson  School  is  distinctly  better  than  the  other  three 
schools.     In  silent  reading  the  Froebel  and  Beveridge 

1See page 38.

2See footnote, page 296.

Figure 52

Development of Abilities: Trabue Language Scales

The scale along the base of the figure represents grades. The scale
along the vertical axis indicates the median degree of development
attained in Trabue units. The solid line, Gary. The broken line,
Trabue Standard. The dotted line, actual score of Tests B and C.

The Gary children do quite as well as, or a little better than, children
in other schools, according to Trabue's Standard.

therefore,  that  the  differences  summarized  in  Table 
LXXII,  page  298,  represent  merely  the  effect  of  foreign 
parentage  rather  than  any  significant  difference  due  to 
the  extent  to  which  the  schools  afford  modern  programs. 

*See footnote on page 206.
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TRABUE  LANGUAGE  SCALE  B 
Median  Scores  Grades1 


CITIES 


Trabue  Standard.  . . 


Omaha  (January)... 


Louisville  (White) 

St.  Paul 

Rochester,  N.  Y 

Horace  Mann  School 
(Columbia  University) 

Pacific  School-Seattle. 

Nassau  Co.,  N.  Y 

Janesville,  Wis 

Mobile,  Ala 

Chatham,  N.  J 

Deadwood,  S.  Dak. . . . 

Muscatine,    la.    (Janu- 
ary)  

Cherokee,  la 

Washington,  la.  (June) 

Webster      City,      I . . 
(March) 

Storm  Lake,  la.  (Nov.) 

Sutherland,  la 

Gary 


5th 


11.4 
5A-11 . 1 
6B-11.6 

11.0 
5A-10.9 
5B-11.0 
12.4 
11.1 
11.55 


10.9 

11.3 
11.7 
11.7 
11.0 


10 
10 
12 


9 
9 
3 


11.8 
11.4 
11 
10 


3 
2 


6th 


12.4 
6A-12.2 
6B-12.6 

12.4 
6-A12.4 
6-B12.5 
13.5 
12.2 
12.57 


12.1 
12.4 
12.8 
12.7 
12.2 
12.8 


12 
12 
13 


2 

7 
4 


12.6 
129 
13.4 
10.7 


7th 


13.4 
7A-13.1 
7B-13.6 

13.4 
7A-13.3 
7B-13.6 
14.4 
13.1 
14.37 

16.2 
13.3 

13.6 
13.9 
14.8 
13.0 

13.1 
14.0 
14.0 

13.2 
13.2 
13.5 
12.9 


8th 


14.4 
8A-14.1 
8B-14.6 

14.7 
8A-14.7I 
8B-14.8 
15.3 
14.0 
14.75 

17.1 
13.7 
14.0 
14.3 
14.4 
16.8 
13.3 

14.3 
14.4 
14.4 

14.9 
13.9 
14.4 
14.0 
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§2.    Critical  Discussion 

GRAY'S ORAL READING SCALE

A consideration of each of the various reading tests
is now in order, the first of which is Gray's Oral Reading
Scale. No discussion of the methods by which the scale
was derived or of the principles upon which it is based is
necessary, as the same are available elsewhere.1

[Table LXXII, page 298; only part of the page is legible.]

ORAL READING

SCHOOL       CLASSES TESTED     +     -
Froebel            28           7     7
Emerson            12           6     1
Jefferson          16           2     3
Beveridge          11           2     2

KANSAS SILENT READING TEST

(Column assignment not recoverable. Legible entries in reading
order: 16, 1, 1, 9; 30 (?), 15, 16, 15 (?); 6, 6, 10.)

This table is to be read as follows: ... tested in oral reading,
7 classes had ... medians and 7 classes had scores ... reading,
of 31 classes tested, 9 had ... medians and 3 below. In rate of
... 7 had scores markedly above the ... the Kansas Silent Reading
Tests, of ... markedly above the city wide median ...

There are, however, certain criticisms of the scale
itself and certain points connected with its use at Gary
which the reader should know.

Gray's Oral Reading Scale as given at Gary measures
the ability to "sound" connectedly and correctly the
words in a given passage. It does not measure expression
nor does it measure understanding. The attempt was
made to have the examiner record his judgment as to the
quality of the expression in the reading, but it was found
impossible to compare expression with a standard, except
on the most intangible and subjective basis. The examiners
soon decided their records were so unreliable as
to be worthless, and the practice was discontinued.

It  is  possible  to  ask  a  child  to  reproduce  orally  or  in 
writing  what  has  been  read,  or  to  answer  questions  in 
regard  to  the  same.  However,  this  was  not  attempted 
systematically  at  Gary.  The  scale  thus  affords  merely 
a  measure  of  the  child's  performance  in  oral  reading.  If 
he  mispronounces  words,  omits  words,  or  adds  to  the 
text,  if  he  repeats,  or  otherwise  misreads,  the  errors  take 
away from his score. The giving of the test is well
standardized and there are adequate and reliable comparative
data. As a whole, therefore, the test is a satisfactory
measure of skill in oral reading, so far as that skill
is defined as ability to pronounce words correctly and in
proper sequence.

1Gray, W. S., Studies of Elementary School Reading through
Standardized Tests. Supplementary Educational Monograph No. 1,
University of Chicago Press.

One limitation of the scale is that the causes of the
increasing difficulty from paragraph to paragraph are
not known and may be due to factors not vitally connected
with reading ability. A single illustration will
make this plain. Paragraphs 2, 3, and 4 have for the
first word "once": "Once there was," "Once there were,"
"Once there lived." Paragraph 5, however, begins
"One of the most interesting birds." Child after child
influenced by the preceding paragraphs begins: "Once
of the." Thus in class No. 11 Jefferson, out of 40
children, 5 missed on this particular point. Tabulation
of other classes yielded similar results. In general,
one child in 10 is so susceptible to the habit forming
influence of the succession of "onces" that he will misread
"one" in paragraph 5. In other words, in working
with the scale one gains the impression that the difficulty
of certain paragraphs of the scale is in part caused by
the occurrence of certain traps or pitfalls, rather than
by real increases in the difficulty of the reading. This
is due, of course, to the empirical basis on which the
selection of the paragraphs rests.

On the other hand, an inspection of the scale shows
that there is in general a real increase in difficulty of
vocabulary, in length of sentence, in difficulty of sentence
structure, and in content of material.

The length of the sentences in the various paragraphs
increases from 7 words to 23, but the increase is
irregular (Table LXXIII, page 302). A word, however,
is a poor unit to use in measuring length of sentence as
words vary so in size. A better unit would seem to be
"sound divisions" or syllables. A syllable in the conventional
sense means merely a group of letters. Syllables
are not always sounded. Thus "dressed," a
two syllable word on the basis of spelling, is pronounced
as though it were spelled "drest"; that is, as a single
sound unit. The term "sound division" or "sound
unit" will be used to indicate that words have been
divided in accordance with their pronunciation, and
not in accordance with their spelling.

From  the  point  of  view  of  such  sound  division,  the 
sentence  length  varies  from  an  average  of  8  to  over 
45  units,  but  again  the  increase  is  irregular.  To  say 
that  a  child  can  read  one  paragraph  but  cannot  read 
the  next  higher  one  is  to  indicate  but  roughly  the  length 
of  sentence  which  he  can  read.  In  future  improvements 
of  the  scale  attention  will  need  to  be  given  to  a  more 
careful  gradation  of  sentence  length  from  paragraph  to 
paragraph. 

In  similar  fashion  it  was  found  that  a  word  of  six 
sound  divisions  occurs  in  paragraphs  10  and  11,  while 
paragraph  2  has  but  five  words  of  more  than  a  single 
sound.  Roughly,  therefore,  the  paragraphs  increase 
in  difficulty  because  of  the  increase  in  the  length  of  the 
words  as  well  as  the  length  of  the  sentence,  but  for  this 
factor also the increase is irregular (Table LXXIV, page 303).

TABLE LXXIII
Analysis of Gray's Oral Reading Scale

PARAGRAPH   NUMBER OF    NUMBER    NUMBER OF   AVERAGE     AVERAGE
NUMBER      SENTENCES    OF        SOUND       WORDS PER   SOUNDS PER
            IN           WORDS     DIVISIONS   SENTENCE    SENTENCE
            PARAGRAPH

    1           7          48         55          6.9         7.9
    2           6          49         54          8.1         9.0
    3           5          49         51          9.8        10.2
    4           6          61         72         10.1        12.0
    5           3          60         76         20.0        25.0
    6           4          62         81         15.5        20.2
    7           3          53         74         17.7        24.7
    8           4          64         89         13.5        22.2
    9           4          52         82         13.0        20.5
   10           5          46         85          9.2        17.0
   11           2          47         89         23.5        44.5
   12           2          38         91         19.0        45.5

This table is to be read as follows: In paragraph 1 of Gray's Oral
Reading Scale there are 7 sentences containing a total of 48 words. In
reading these words orally there are 55 separate sound syllables or
divisions. (See Test.) That is, the average length of a sentence is 6.9
words or 7.9 sounds.

As  a  whole,  the  table  shows  that  the  average  length  of  the  sentences 
in  words  increases  irregularly  from  the  beginning  to  the  end  of  the  scale; 
that  in  terms  of  the  number  of  sound  divisions  the  average  length  also 
increases  irregularly. 
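
The two right-hand columns of Table LXXIII follow directly from the counts in the other columns. As a minimal sketch (the function name sentence_averages is introduced here for illustration and is not part of the report), the averages for any paragraph may be recomputed by dividing the word count and the sound-division count by the number of sentences:

```python
def sentence_averages(sentences, words, sounds):
    """Return (average words per sentence, average sound divisions per sentence),
    each rounded to one decimal place as in the table."""
    if sentences <= 0:
        raise ValueError("a paragraph must contain at least one sentence")
    return round(words / sentences, 1), round(sounds / sentences, 1)


# Paragraph 1 of the scale: 7 sentences, 48 words, 55 sound divisions.
print(sentence_averages(7, 48, 55))   # (6.9, 7.9)
# Paragraph 11: 2 sentences, 47 words, 89 sound divisions.
print(sentence_averages(2, 47, 89))   # (23.5, 44.5)
```

The same division applied to each row is a quick way to check a transcription of the table against its own averages.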


The vocabulary also increases in difficulty from paragraph
to paragraph. The various words new to each
paragraph are shown in Table LXXV, between pages 304
and 305. Unfortunately, there is no information as to
the relative frequency of occurrence of words in children's
reading vocabularies, so that it is impossible to evaluate
the difficulty of the paragraphs from this point of view.
However, in each paragraph the children are called upon
to read 24 or 25 new words1 and these words are, in
general, increasingly longer and less common from paragraph
to paragraph.

The analysis will not be pushed further, as enough
has been said to show the incompleteness of our knowledge
of the causes of the increase in difficulty from paragraph
to paragraph. Nevertheless, in the opinion of
the author, Gray's Scale is one of the most satisfactory
of the various measuring scales and is probably as perfect
as it can be made on the basis of present knowledge.

A drawback in the use of the Gray Scale is that
children must be measured individually, so that the
measurement of a school system requires a great deal
of time. For this reason, in certain grades at Gary only
selected children were measured. The method of sampling
was as follows:

For all grades from two to eight inclusive the teachers
of reading were asked to fill out for each child in the class
a judgment card like that shown in Figure 53.
In grades three, five, and seven practically all the children
were tested. In the other grades 10 children were chosen
from each class: the 3 given the highest mark by the
teacher, the 3 lowest, and the 4 nearest the center of the
class. In all 1557 children were examined.
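
The rule of selection just described can be sketched in a few lines. This is an illustration only: the function name, the use of numerical marks, and the reading of "nearest the center" as the children standing in the middle of the ranked class are assumptions made here, not details given in the report, and the class shown is made up.

```python
def select_sample(marks, n_high=3, n_low=3, n_mid=4):
    """Pick children by teacher's mark: the highest, the lowest, and the middle.

    `marks` maps a child's name to the teacher's mark (higher is better).
    Returns three lists of names: (highest, lowest, middle).
    """
    needed = n_high + n_low + n_mid
    if len(marks) < needed:
        raise ValueError(f"need at least {needed} children, got {len(marks)}")
    # Rank from lowest mark to highest; ties keep their original order.
    ranked = sorted(marks, key=marks.get)
    lowest = ranked[:n_low]
    highest = ranked[len(ranked) - n_high:]
    # From the children left over, take the block centred on the middle rank.
    rest = ranked[n_low:len(ranked) - n_high]
    start = (len(rest) - n_mid) // 2
    middle = rest[start:start + n_mid]
    return highest, lowest, middle


# Hypothetical class of 40 children with marks 1 to 40 (made-up data).
marks = {f"child{i:02d}": i for i in range(1, 41)}
highest, lowest, middle = select_sample(marks)
print(highest)  # ['child38', 'child39', 'child40']
print(lowest)   # ['child01', 'child02', 'child03']
print(middle)   # ['child19', 'child20', 'child21', 'child22']
```

A sample drawn in this way represents the class only so far as the marks on which it rests are trustworthy.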

This method of measurement by sampling is open to
question. If 10 children are chosen from a class of
40, the class will be misrepresented by the resulting
score based upon the performances of the 10 children,
unless the teacher's judgment is reliable. In grades
three, five, and seven practically the full class membership
(except for absence) was measured in each class,
although the scores were tabulated in groups of 10 as in
other grades. This makes it possible to determine the
extent to which the method of sampling is valid.

Figure 53 (caption, partly legible): Judgment cards
were filled out by the teachers for each individual child in grades
2 to 8. Selection of children to be measured by Gray's Oral Reading
Scale was made on the basis of the marks shown on these cards. Ten
children were chosen from each class, the group being composed of the
three having the highest rating, the three having the lowest rating, and
four of average ability. In grades 3, 5, and 7, practically all the
children were tested as well.

It  was  found  that  the  differences  between  the  scores 
made  by  all  the  children  in  a  class  and  by  the  selected 
group  were  small,  averaging  3  points.  This  is  less 
than  the  usual  error  of  measurement  (Table  LXXVI, 
page  308). 

The examiners at Gary were the writer and his assistants
and six graduate students from Professor Gray's
own classes in the University of Chicago. These latter
were trained in the use of the scale by Professor Gray
himself. As a further precaution a large part of the
first day's scoring was done in duplicate. That is, as
a child read from the scale, two examiners made independent
records, and after the child had left the room
the two records were compared and doubtful entries
discussed.

A study of these duplicate scores makes it plain that
the use of the scale leads to consistent records of children's
performances (Tables LXXVII and LXXVIII, pages
309-310). In 21 per cent. of the cases the two records
agreed exactly; in 70 per cent. of the cases the differences
were 2.5 points or less (about half a year's growth),
and in only 6 per cent. was the difference greater
than 5 points. There were differences in the closeness
of the agreement between the records of the various
pairs of judges, and, in general, those who had had
the most experience in the use of the scale had the most
consistent records. None of the five pairs of judges had
an average difference of as much as 3 points. The results
may, therefore, be accepted as revealing the true abilities
of the children within a half grade. In this test, as elsewhere,
the class averages, as determined by two independent
observers, differ by very small amounts.

In making the comparisons noted above it was found
that there were often marked disagreements as to the
actual mistakes made, particularly when the number of
mistakes was large, but usually close agreement as to
the number of mistakes. Two sets of independent records
are given in Figure 54, pages 312-313. For paragraph
3, the two records agree; for paragraph 5 there is
a disagreement of 2 seconds in the time records and of 1
error in the mistakes. In both will be found differences
in the manner in which the mistakes were recorded. As,
however, six is the maximum number of mistakes that
can be made and have the performance count at all, the
differences are not serious and, as has been shown, do
not affect many scores. On the other hand, they invalidate
the records so far as types of mistakes are concerned
and no such analyses were attempted.

The method of scoring adopted by Professor Gray
yields approximately constant scores from grade to grade.
Thus the fourth and fifth grades at Gary made an average
score of 39. This does not mean that they have the
same ability; it means merely that the fourth grade is as
near the fourth grade standard as the sixth grade is near
the sixth grade standard. To indicate the progress
from grade to grade it is necessary to convert the actual
scores into relative scores by adding 20 points to the
second grade score and 5 points additional, the average
yearly progress, for each grade above the second. In
the form of graph adopted by Professor Gray this allowance
is made by shifting the grade scales. If, however,
the Gary grade averages were reduced to units on a single
scale with an arbitrary zero point, the scores would have
the values shown in Table LXXIX.


TABLE LXXIX

Gary Results, Gray's Oral Reading Scale, Expressed in Terms
of a Single Uniform Scale

GRADE                              2    3    4    5    6    7    8

Gary Average                      27   36   39   39   41   42   41

Gary Values on a uniform scale    47   61   69   74   81   87   91

In  Figure  46,  Section  I,  of  this  chapter  the  scale  along 
the  vertical  axis  enables  the  conventional  graph  to  be 
read  in  terms  of  a  single  scale. 
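
The conversion to a single scale described above is simple arithmetic: 20 points are added to the score, and 5 points more for each grade above the second. A minimal sketch follows; the function and parameter names are chosen here for illustration, while the grade averages are those of Table LXXIX.

```python
def uniform_scale(grade, score, base=20, yearly_progress=5, first_grade=2):
    """Convert a Gray score made in `grade` to a value on a single uniform scale."""
    if grade < first_grade:
        raise ValueError("the conversion is described only for grade 2 and above")
    return score + base + yearly_progress * (grade - first_grade)


# Gary grade averages as given in Table LXXIX.
gary_averages = {2: 27, 3: 36, 4: 39, 5: 39, 6: 41, 7: 42, 8: 41}
for grade, score in gary_averages.items():
    print(grade, score, uniform_scale(grade, score))
```

Run on the Gary averages, this gives the uniform-scale values 47, 61, 69, 74, 81, 87, and 91 for grades two to eight.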

Figure 54
Agreement and Disagreement in Scoring of Two Judges

[Facsimile of the record sheets of Judge A and Judge B for
paragraphs 3 and 5 of Gray's Oral Reading Scale, each bearing the
judge's handwritten record of time and marks of mistakes.
Paragraph 3 begins "Once there were a cat and a mouse";
paragraph 5 begins "One of the most interesting birds." The
handwritten entries are for the most part not legible; Judge B's
entry for paragraph 5 reads 60 seconds, 5 mistakes.]

In paragraph 5, although the actual number of mistakes recorded
differs but 1, the errors recorded are differently marked.

It will be noted that the increase in score from grade
to grade is small (5 points after the second grade). The
conventional method of scoring obscures this fact as all
grades make nearly the same score, and within each
grade the range of scores is large. For instance, in
Froebel class No. 44 (seventh grade) the lowest individual
score was 20, the highest 52. There were 4 scores in
the twenties, 6 in the thirties, 16 in the forties, and 2
in the fifties. In other words, the individual scores
show a range of variation equal to more than six times
the average yearly progress, yet the maximum difference
between the scores of the 24 seventh grade
groups (of approximately ten children each) examined
was 10.4 points. In other words, differences between
scores of classes and of cities are much more significant
than differences between individuals. The differences
between the Gary and Grand Rapids scores, for instance,
may be transformed into years of difference by dividing
by 5. Whether or not differences between individual
scores may be similarly transformed must await the
evaluation of the results of repeated measurement of the
same individual children from grade to grade.
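
Since the average yearly progress is 5 points, a difference in score may be restated in years of progress by dividing by 5. A small sketch, with a function name chosen here for illustration, applies this to the two figures quoted in the paragraph above:

```python
AVERAGE_YEARLY_PROGRESS = 5  # points per grade, as stated in the text


def years_of_difference(score_difference):
    """Express a difference in Gray scores as years of average progress."""
    return score_difference / AVERAGE_YEARLY_PROGRESS


# Range of individual scores in Froebel class No. 44: highest 52, lowest 20.
print(years_of_difference(52 - 20))          # 6.4
# Largest difference between the 24 seventh grade groups: 10.4 points.
print(round(years_of_difference(10.4), 2))   # 2.08
```

As the text cautions, the reading in years is offered for class and city scores; whether it holds for individual scores is left open.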

REPRODUCTION  TESTS 

The  reading  and  reproduction  tests,  as  already  pointed 
out,  measure,  on  the  one  hand,  rate  of  reading,  and,  on 
the  other,  a  complex  ability  made  up  mainly  of  three 
elements — ability  to  comprehend  something  of  what  was 
read,  ability  to  remember  the  same  for  several  minutes 
while  engaged  in  reproducing  it,  and  ability  to  organize 
the  words  and  ideas  remembered  into  a  connected  story. 
Ability  to  read  must  be  credited  with  a  minor  part  in 
determining  a  child's  score. 
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The  rate  of  reading  in  itself  is  probably  not  a  significant 
measure of anything. The range of variable performance
is so large that rate scores in such tests as were
given  at  Gary  are  mere  symptoms,  that  is,  slight  changes 
in  the  conditions  or  incentives  to  reading  effort  produce 
large  variations.  Inferences  from  symptoms  are  reliable 
only  when  the  conditions  which  cause  variation  in  the 
symptom  are  known.  For  instance,  the  rate  of  silent 
reading  at  Gary  appears  low  in  comparison  with  results 
from  similar  tests  in  other  cities.  So  far  as  is  known, 
there  is  nothing  in  the  tests  and  the  testing  conditions 
which  would  cause  this  variation,  and  the  results  from 
two  or  three  trials  of  different  tests  are  consistent.  Yet 
the  same  tests,  as  has  been  pointed  out,  bring  to  light 
very  large  individual  variations.  The  causes  of  these 
are  not  known.  The  reader  must  be  careful,  therefore, 
to  remember  that  while  every  precaution  has  been  taken 
to  make  the  data  reliable,  there  is  so  much  to  be  learned 
about  reading  ability  that  later  investigations  may 
prove  present  conclusions  to  have  been  unwarranted. 

Analysis of the reproductions, item by item, affords
an opportunity for showing the manner in which the
factor of memory operates. Certain subdivisions of the
story were recalled by nearly all who read them. These
elements form the gist of the story. Within each subdivision,
however, there is marked decline of the frequency
with which the various items were recalled, and,
in general, the longer an item has to be remembered the
less frequently it is recalled. That is, only children
with exceptional memories will reproduce the minor
items of any main thought (Table LXXX, page 316,
Figure 55, page 317).

Figure 55
Relative Frequency of Recall

[Graph: relative frequency of recall, reproduction test; thoughts
A to Z along the base.]

The scale along the base of the figure represents the various thought
divisions in the key for scoring the reproduction test. The scale along
the left hand vertical axis shows the per cent. of recall. "Number of
cases" shows the average number of children recalling each of the thought
groups. The grouping is on the basis of connected thoughts. For
instance, thought divisions A, B, C, D, E, F, and G all have to do with
the fact that Fred was late to breakfast. Solid lines show the relative
frequency of recall for the various thoughts within each division.

It should be noted that some thoughts are of more importance than
others, that in general, the first thought in each division is the important
one. The figure, as a whole, illustrates the part memory plays in
determining the material reproduced.

A careful analysis was made of the data from one of
the major subdivisions, thoughts A to G, which tell
about Fred's coming late to breakfast and the reason for
his tardiness. Within each thought the items were
further classified into groups on the basis of percentage
of recall (Table LXXXI, between pages 318 and 319).
A study of these results shows that it is the modifying
ideas and the unusual words or unusual forms of expression
which are forgotten or avoided. For instance, subdivision
B was the sentence: "So late, that all the other
members of the family were through and had gone about
their respective duties," and subdivision E: "Whereas
he was usually in bed by nine." Only 26 per cent. of the
children who reproduced some portion of thought B wrote
"respective," and but 31 per cent. of those who reproduced
some part of thought E wrote "whereas." It
would be pure assumption to say that only those who
understood the meaning of these words reproduced them.
Absolute proof is lacking that the selective action evident
in the table may be attributed to memory, but the evidence
is sufficiently clear to make it proper to question
reproduction scores as measures of reading ability.

RELATIVE  RATES   OF  ORAL  AND   SILENT  READING 

Closely connected with the measurement of rate of
reading is the comparison of the rate of oral with the
rate of silent reading. Such a comparison, however,
is probably something quite different from what it appears
to be. Rate of oral reading is based upon the
number of words read per unit of time, and rate of silent
reading is based upon the number of words read per unit
of time. In words these two statements appear to
mean the same thing, but account must be taken of the
differences in the activities connoted by the word
"read." In oral reading, a word read means a word
sounded; in silent reading, a word is read when it has
been passed in the course of the reading activity. These
two methods of reading have in common the movement
of the eyes by which the various words are brought into
view one after another, but in oral reading the limiting
factor is the rapidity with which the words can be voiced,
a muscular reaction, and in silent reading the limiting
factor is the rapidity with which the words seen can be
comprehended, a mental activity. The two activities
really have nothing in common that may properly be
compared, and the comparison is valid only in the
sense of comparing the total material covered in a given
time. For this purpose the word is as good a unit as
any, so long as the material is unselected, so that the
average length of word is constant. But for Gray's
Scale the average length of word is not constant. It
varies from 3.3 letters per word in paragraph 1 to 6.7
letters per word in paragraph twelve. In terms of syllables
or, better, "sound divisions," the average varies
from 1.04 per word in paragraph 3 to 2.5 in paragraph
12. Moreover, the various sound divisions vary in
length depending upon their position in the sentence and
the necessity of prolonging them in critical positions to
give the proper expression in reading. The number of
pauses that must be made between sentences also affects
the rate at which the oral reading activity proceeds.
The word is, therefore, very far from being a proper
unit in which to express rate of oral reading when attempting
to measure the rate at which the activity
itself proceeds. For this reason, Gray's statement that
"as the subject matter to be read increases in difficulty
the rate of reading is decreased, although no errors may
be made" is not true in so far as it rests upon the evidence
of increasing time to read the more difficult paragraphs.
For it is quite impossible to read paragraphs 2 and 12
even after thorough study and preparation at such
rates that their times will be proportional to the number
of words they contain, provided both are read "naturally."
For instance, the writer's time to read the
49 words of paragraph 2 was 11.7 seconds, and
for paragraph 12, which contains a smaller number
of words (38), was 16.3 seconds. Expressed in terms
of sound divisions the ratio of the two times would
be 1.68; actually it is 1.39. The disagreement is due
to the fact that allowances must be made for the
number of pauses. The curve for the rate of oral reading
based upon words read per minute is not a curve which
expresses in any way the rate of development in the rate
at which oral reading activity proceeds, unless in every
case the identical material is read. For this reason the
curve in Figure 45 is based upon the average rate for
paragraphs 1 and 2 in every grade. The curve is valid,
therefore, for this material alone and may not represent
the general development for rate of oral reading.
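
The comparison of paragraphs 2 and 12 can be worked through from the counts in Table LXXIII and the writer's two reading times. The sketch below is illustrative only and the names in it are chosen here. Note that the quotient of the sound divisions, 91 divided by 54, is 1.685; the text gives it as 1.68, which is the same figure cut off rather than rounded.

```python
# Counts from Table LXXIII and the writer's own reading times, by paragraph.
words = {2: 49, 12: 38}
sound_divisions = {2: 54, 12: 91}
seconds = {2: 11.7, 12: 16.3}


def ratio(values, a=12, b=2):
    """Ratio of the value for paragraph `a` to the value for paragraph `b`."""
    return values[a] / values[b]


print(round(ratio(words), 2))            # 0.78  time ratio expected from words
print(round(ratio(sound_divisions), 2))  # 1.69  time ratio expected from sounds
print(round(ratio(seconds), 2))          # 1.39  time ratio actually observed
```

Neither unit predicts the observed ratio: the count of words falls far short of it and the count of sound divisions overshoots it, which is the point of the argument above.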

Neither is the rate of silent reading in words per minute
a measure of that activity. The critical factor in determining
the rate of silent reading is the number and character
of the eye shifts or pauses. Material is not read
word by word. The length of a word has little to do
with the rapidity with which it is perceived. The number
of words read per minute of time is not a measure of
the activity itself, only of its results. The curve for
silent reading, therefore, does not represent the rate of
development of that activity. It is extremely probable
that when suitable studies have been made so that true
development curves for these two rates of reading may be
compared, the conclusions will be quite different from
those suggested by the present rates of reading.

For the present, the reader should understand that the
two curves shown in Figure 47, page 278, yield a comparison
merely of the amount of material read in a given time.
Such a comparison has a diagnostic value, however, for if
children in one school system show relatively less difference
between the amount of material covered in a given
time in oral and silent reading than children in other
schools, it indicates a difference in the relative efficiency
of the development of the two processes. However,
nothing is known in regard to the way these curves would
be altered by a change of material. The curves of other
investigators have been based upon different material.
Whether or not this alone is responsible for the differences
in the Gary scores is not known. The Gary curves seem
to represent a real and an unsatisfactory condition, but
the reader should keep in mind the possibility (although
not the probability) that the effect noted is due wholly
to the use of faulty units of computing rate of reading.

In this connection a second possibility must be considered,
the manner in which the rate of reading is computed.
Gray  used  the  harmonic  mean1  in  determining  rate  instead 
of  the  arithmetic  mean,  or  the  median  used  by  Brown, 
Oberholtzer,  Whipple,  Courtis,  and  other  investigators. 
The  harmonic  mean  usually  yields  a  smaller  result,  as 
Gray  himself  points  out.  The  Gary  rates  are  medians  of 
actual  rates.  That  is,  in  the  Gary  determinations  of  both 
oral  and  silent  reading,  the  actual  number  of  words  read 
per  minute  by  each  child  was  computed  from  the  time  and 
the  total  words  read,  and  the  median  of  these  rates  was 
taken  as  the  class  score.  In  Gray's  work,  however,  the 
times  required  to  read  one  hundred  words  were  averaged 
and  the  rate  computed  on  the  basis  of  the  average  time. 
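
The two procedures can be compared in a minimal Python sketch. The five reading times below are hypothetical, chosen only for illustration; they are not taken from the Gary or the Gray records. The sketch also shows that a rate computed from the average time is the same thing as the harmonic mean of the individual rates.

```python
from statistics import harmonic_mean, mean, median

# Hypothetical seconds needed by five children to read 100 words
# (illustrative values only, not data from the survey).
seconds_per_100_words = [20.0, 25.0, 30.0, 40.0, 50.0]

# Gary procedure: compute each child's actual rate, then take the median.
rates = [100 * 60 / s for s in seconds_per_100_words]  # words per minute
median_rate = median(rates)
mean_rate = mean(rates)

# Gray procedure: average the times, then convert the average time to a rate.
rate_from_mean_time = 100 * 60 / mean(seconds_per_100_words)

print(median_rate, mean_rate, round(rate_from_mean_time, 1))
# The rate from the average time equals the harmonic mean of the rates.
print(round(harmonic_mean(rates), 1))
```

With these assumed times the rate from the average time falls below both the median and the arithmetic mean of the individual rates, which is the direction of difference described in the text.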

1 Gray, W. S., Studies of Elementary School Reading through Standard
Tests, p. 105.

An illustration will make the difference clear. For
instance, if [table of individual rates not preserved]
the average rate would be 220 words per minute.1 But
from the same record it might be computed that

Child A would read 100 words in 28.1 seconds
[remaining rows not preserved; total, 500]

and the average time to read one hundred words, all five
children reading simultaneously each his share, would
be 30.7 seconds.

If  now  the  number  of  words  that  would  be  read  in  one 
minute  is  computed  from  the  average  time,  the  rate 
becomes  195  words  per  minute  instead  of  220  words 
per  minute  as  above.  Gray's  rates  of  silent  reading  are, 
therefore,  probably  lower  than  other  rates  because  of 
this  difference  in  the  method  by  which  they  are  computed. 
The differences, that is, are favorable to Gary, and the
maximum difference2 would amount to about nine words.
The effect of this factor is probably negligible. However,
the reader should note that the Gary results are presented
as the central tendencies of the actual rates of reading
(that is, the rate which is the most representative of the
group) and not the time that would be required by the
class to read the paragraph if each member of the class
were  to  read  a  part,  all  reading  for  the  same  length  of 
time,  each  at  his  own  rate.3 

1 Median rate, 200 words per minute.

2 Judging from the data given on p. 113 of Gray's Monograph.
3 See also Seventeenth Yearbook, National Society for the Study of
Education,  page  125. 

KANSAS SILENT READING TESTS

The  Kansas  Silent  Reading  Tests  are  of  value  because 
they  measure  a  type  of  ability  that  is  extremely  impor- 
tant in  life,  the  ability  to  base  action  upon  judgments  in 
situations, information in regard to which has been obtained
wholly through reading. The complex of reading,
judging, and acting brings into play a great number of
separate  abilities,  very  few  of  which  are  the  direct  prod- 
ucts of  classroom  training  in  reading. 

As the Kansas Reading Tests measure the performance
of individuals only in specific situations, any
score  based  upon  a  single  phase  of  the  performance  is 
incomplete.  Accordingly,  the  Kansas  Tests  were  scored 
also  for  amount  attempted.  In  order  that  such  rate 
scores  might  be  comparable  with  the  conventional 
results,  the  assigned  values  for  each  exercise  were  used 
as  a  basis.  That  is,  a  child's  rate  score  was  found  by 
adding the assigned values of every exercise for which
the child wrote an answer, whether the answers were right
or not. Then, in certain grades, for each child the relation
between the conventional score and the rate score
was expressed as a rate per cent. The Kansas Reading
Tests thus yielded three sets of scores: (1) rate score,
the number of points completed in a given time; (2)
conventional score, the number of points allowed for
correct answers; and (3) accuracy score, the per cent
of the points attempted that were right.
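
The three scores can be sketched in a few lines of Python. The function name `kansas_scores` and the sample record are hypothetical, introduced only to illustrate the definitions just given; the assigned values shown are not those of any actual exercise.

```python
def kansas_scores(exercises):
    """Return (rate score, conventional score, accuracy per cent).

    `exercises` is a list of (assigned_value, attempted, correct) tuples,
    one per exercise of the test.
    """
    # Rate score: assigned values of every exercise answered, right or wrong.
    rate = sum(value for value, attempted, _ in exercises if attempted)
    # Conventional score: assigned values of the exercises answered correctly.
    conventional = sum(
        value for value, attempted, correct in exercises if attempted and correct
    )
    # Accuracy score: per cent the points right are of the points attempted.
    accuracy = 100 * conventional / rate if rate else 0.0
    return rate, conventional, accuracy


# Hypothetical record: four exercises, three attempted, two of them right.
record = [(1.0, True, True), (2.0, True, False), (3.0, True, True), (4.0, False, False)]
rate, conventional, accuracy = kansas_scores(record)
print(rate, conventional, round(accuracy, 1))
```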

In  most  of  the  grades  the  accuracy  scores  were  found 

TABLE LXXXII

Comparative Results of Three Trials of the Kansas Silent Reading Tests1

1 Based on the scores of 42 eighth grade pupils.

This  table  is  to  be  read  as  follows:  40  per  cent,  of  the  42  eighth  grade 
children  were  found  to  have  maintained  the  same  position  in  the  group 
within  one  unit  of  variability  when  the  conventional  scores  from  Test  I 
were  compared  with  the  conventional  scores  from  Test  II.  The  table 
shows  that  the  correspondence  is  greatest  between  rate  scores  or  scores 
for  number  of  points  attempted  in  a  given  time,  and  lowest  for  the 
accuracy  scores,  or  the  per  cent  the  points  right  were  of  the  points 
attempted. 

by means of the approximate method adopted for the arithmetic
tests. The record sheet was so arranged that the mere
entry of the rate and conventional scores indicated the
accuracy score without computation. As no comparable
data are available, it was not thought necessary
to determine the size of the error in such short cut
calculations. It cannot be large and the method was
used consistently. Small errors would have no influence
on the general conclusions.
As affording a slight indication of the reliability of
the Kansas Tests, the coefficients of correspondence
were computed for scores of the 42 eighth grade
children present in all tests. The correspondence is
greater for the rate scores than for either the accuracy
or the conventional scores. A little more than half the
children maintain the same relative position in the
group within about 5 points for rate of work, while only
approximately a third of the group maintain the same
relative positions within 10 per cent for accuracy
and 5 points for the conventional score. (Table
LXXXII, page 325.)
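
A coefficient of this kind might be computed as in the following sketch. This section does not define the unit of variability, so the sketch assumes it to be the median deviation of the scores from the group median, and it assumes that a child "maintains the same position" when his two variability ratios differ by not more than one unit. Both assumptions, the function names, and the sample scores are illustrative only and may differ from the procedure actually followed at Gary.

```python
from statistics import median


def variability_ratios(scores):
    """Express each score as a deviation from the group median,
    measured in units of the median absolute deviation (an assumed unit)."""
    centre = median(scores)
    unit = median(abs(s - centre) for s in scores)
    if unit == 0:
        raise ValueError("scores show no variability")
    return [(s - centre) / unit for s in scores]


def coefficient_of_correspondence(first, second, tolerance=1.0):
    """Per cent of children whose positions in two tests agree within
    `tolerance` units of variability. Scores are paired child by child."""
    a = variability_ratios(first)
    b = variability_ratios(second)
    same = sum(1 for x, y in zip(a, b) if abs(x - y) <= tolerance)
    return 100 * same / len(a)


# Hypothetical scores of five children in two tests.
test_one = [10, 12, 14, 16, 18]
test_two = [11, 12, 15, 20, 17]
print(coefficient_of_correspondence(test_one, test_two))
```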

These results mean that a single measurement of a
child with the Kansas Tests does not yield very reliable
information in regard to his abilities. He may do very
much better or very much worse in a second measurement
with a different test (Table LXXXIII, page 326). How
much of this variation is an indication of the inefficiency
of training and how much of it is due to the defects of
the tests themselves cannot be told at present.

The Kansas Reading Tests are interesting as throwing
some  light  on  the  ability  of  the  Gary  children  to  solve 
the  conventional  arithmetical  problems.  The  exercises 
of  this  character  are  as  follows: 

TEST I, EXERCISE XI

"We planted three trees in a row. The first one was
nine feet tall and the last one was three feet shorter than
the first one. The middle one was two feet taller than
the last one. How tall was the middle one?"

(Assigned value, 2.2. Per cent. of eighth grade children
missing, 28. Number of cases, 121.)

TEST I, EXERCISE XV

"Fred has eight marbles. Mary said to him: 'If you
will give me four of your marbles, I will have three times
as many as you will then have.' How many marbles
do they both have?"

(Assigned value, 4.8. Per cent. of eighth grade
children missing, 44. Number of cases, 48.)

TEST II, EXERCISE XIII

"If it takes a man an hour to walk around a square,
each side of which is a mile in length, how long will it
take him to walk eight miles?"

(Assigned value, 4.3. Per cent. of eighth grade children
missing, 44. Number of cases, 16.)

TEST III, EXERCISE III

"I have five plums and Mary has four plums. Jane
comes along and we see that she hasn't any. . . .
We wanted to divide with Jane in such a way that we
shall all three have the same number. I give Jane two
plums. How many must Mary give her?"

(Assigned value, 3.5. Per cent. of eighth grade children
missing, 9. Number of cases, 131.)

TEST III, EXERCISE V

"A, B, C and D in the straight line represent four
places lying in a straight line. From A to B is four
miles, from C to D is seven miles, from A to D is fourteen
miles. How far is it from B to C?"

A B C D

(Assigned value, 3.8. Per cent. of eighth grade children
missing, 34. Number of cases, 125.)

TEST III, EXERCISE VIII

"There are three horizontal lines; the first is three
inches in length, the second two inches, the third one
inch. We know that if the second and third lines are
joined end to end the resulting line will be as long as
the first line. Suppose that the first and second lines
are joined end to end. How many times as long as the
third line will the resulting line be?"

(Assigned value, 4.8. Per cent. of eighth grade children
missing, 29. Number of cases, 82.)
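
The numerical relations involved are slight, as a short Python sketch shows. The answers are not printed in the text; the values below are simply worked from the problem statements, and Exercise XV is read here as asking for the combined number of marbles.

```python
# Test I, Exercise XI: heights of the three trees, in feet.
first_tree = 9
last_tree = first_tree - 3
middle_tree = last_tree + 2

# Test I, Exercise XV: after Fred gives away 4 of his 8 marbles,
# Mary has three times as many as Fred then has.
fred_after = 8 - 4
mary_after = 3 * fred_after
marbles_total = fred_after + mary_after  # the gift does not change the total

# Test II, Exercise XIII: a square of one-mile sides is walked in an hour.
miles_per_hour = 4 * 1
hours_for_eight_miles = 8 / miles_per_hour

# Test III, Exercise III: nine plums shared equally among three.
share_each = (5 + 4) // 3
mary_gives = share_each - 2  # Jane has already received two

# Test III, Exercise V: distances along the line A, B, C, D, in miles.
b_to_c = 14 - 4 - 7

# Test III, Exercise VIII: first and second lines joined, against the third.
times_as_long = (3 + 2) // 1

print(middle_tree, marbles_total, hours_for_eight_miles,
      mary_gives, b_to_c, times_as_long)
```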

It  will  be  observed  that  all  of  these  are  simple  problems 
as far as the numerical relations are concerned. Exercise
VIII  contains  a  great  many  words,  but  in  the  others  the 
problems  are  clearly  stated,  and  the  situations  are  within 
the  experiences  of  the  children.  In  Tests  I  and  II  the 
arithmetical  problems  come  so  late  in  the  test  that  but 
relatively  few  of  the  children  get  to  them  and  these,  of 
course,  are  either  the  more  able  members  of  the  class  or 
those  who  worked  at  high  rate  with  low  accuracy.  In 
Test  III,  however,  practically  the  full  class  membership 
attempted  problems  3  and  5.  (Table  LXXXIV,  page 
332.) For Test III two other exercises which are
non-mathematical have been included for comparison. For
instance, number 4 was: "In the following words, find one
letter which is contained in only three of them, and then
cross out the word which does not contain that letter":

ail  thief  live  anvil 

The  results  show  conclusively  that  many  of  the  Gary 
eighth  grade  children  are  unable  to  solve  simple  arith- 
metical problems  when  presented  in  printed  form  and 
under  test  conditions  (average  accuracy,  6  problems, 
69  per  cent.).  On  the  other  hand,  it  must  be  remembered 
that  the  Gary  eighth  grade  scores  in  the  Kansas  Test  are 
almost  exactly  at  the  Kansas  standard,  which  is  also  the 
score  made  by  many  eighth  grade  classes  in  other  cities. 
Comparative data for the number missing on the different
exercises  of  the  test  are  lacking.  It  may  well  be  that  the 
Gary  children  do  as  well  with  such  problems  as  the  chil- 
dren in  the  conventional  schools.  However,  the  results 
are  presented  for  their  absolute,  not  their  comparative, 
value.  The  reader  must  judge  for  himself,  therefore, 
whether  the  conditions  revealed  by  the  data  above  and 
in  the  tables  are  satisfactory. 
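
The average accuracy of 69 per cent quoted for the six problems can be checked against the percentages missing reported with each exercise. The sketch below assumes a simple unweighted average of the six exercises, which is not stated in the text but agrees with the published figure.

```python
# Per cent of eighth grade children missing each arithmetical exercise,
# as reported with the exercises above.
per_cent_missing = {
    "Test I, Exercise XI": 28,
    "Test I, Exercise XV": 44,
    "Test II, Exercise XIII": 44,
    "Test III, Exercise III": 9,
    "Test III, Exercise V": 34,
    "Test III, Exercise VIII": 29,
}

# Accuracy on an exercise is the per cent not missing it.
accuracies = [100 - missing for missing in per_cent_missing.values()]

# Unweighted average over the six exercises (an assumed method).
average_accuracy = sum(accuracies) / len(accuracies)
print(round(average_accuracy, 1))
```

An average weighted by the number of cases would give a different figure, since the exercises were attempted by very unequal numbers of children.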

To  the  writer  the  results  seem  poor,  but  whether  the 
difficulty  is  caused  by  poor  training  in  reading  or  in 
arithmetic  he  cannot  tell.  He  prefers,  therefore,  to  take 
the  position  that  reasoning  ability  in  arithmetic  has  not 
been  measured  at  Gary,  and  to  terminate  this  discussion 
with  the  repetition  of  the  statement  that  the  eighth 
grade  children  do  as  well  in  the  Kansas  Reading  Tests 
as  the  children  in  many  other  cities. 

TRABUE  LANGUAGE  SCALES 

The  Trabue  Language  Scales  measure  a  very  complex 
ability.  Their  author  makes  no  claim  that  they  measure 
reading  ability,  but  they  are  classed  with  the  reading 
tests  because  reading  ability  is  one  factor  in  determining 
a  child's  score.  However,  for  these  tests  as  for  the 
Kansas  Tests,  while  a  child  who  cannot  read  cannot  make 
a  high  score,  a  low  score  does  not  necessarily  mean  ina- 
bility to  read. 

If  a  child  scans  such  a  sentence  as  "The  sky  —  blue," 
the  word  "is"  rises  to  consciousness  spontaneously. 
The  test  is,  therefore,  in  one  aspect  at  least,  a  measure 
of a person's habits of association in connection with the
use of words. When, however, one reads, "One ought

to ____ great care to ____ the right ____ of ____, for

one who ____ bad habits ____ it ____ to get away

from them," much more ability is required to fill the
blanks  correctly  than  the  mere  possession  of  a  given  set 
of  habitual  associations,  for  the  sentence  has  been  so 
mutilated  that  there  are  few  significant  words  to  act  as 
stimuli. "Take" or "exercise" suggests itself readily
for the first blank, because "take care" is both said and
heard  frequently  by  all;  but  "right"  is  absolutely  mean- 
ingless without  some  further  cue,  as  it  occurs  with  all 
sorts  of  words.  So  with  certain  others  of  the  blanks. 
"Habits,"  however,  stirs  up  a  host  of  associations,  and 
"get  away  from  them"  suggests  habit  formation  and 
habit  correction.  With  this  cue  as  a  basis,  the  intelligent 
child  with  sufficient  initiative  can  go  back  to  the  other 
blanks  and  supply  experimentally  a  whole  series  of 
words  until  a  set  is  found  which  "makes  sense."  This, 
however,  involves  the  exercise  of  several  new  ranges  of 
information  and  power.  In  short,  in  these  tests,  reading 
is a very small factor and general intelligence (i.e., the
ability  to  comprehend  a  situation  and  to  bring  all  one's 
resources  to  bear  upon  it  in  an  efficient  manner  in  order 
to  make  an  adjustment  in  it  which  shall  fulfill  a  desired 
end)  is  the  critical  factor.  Perhaps  the  best  statement 
that  can  be  made  is  that  of  the  author:  "Nothing  is 
known  about  what  the  tests  measure,  except  that  in  some 
way success is related to language work in school, and
scores  in  the  tests  show  a  high  correlation  with  general 
ability." 

The  sentences  composing  the  tests  have  been  so  chosen 
for  difficulty  that  each  is  (approximately)  as  much  easier 
than  the  one  after  it  as  it  is  more  difficult  than  the  one 
before  it.1  In  other  words,  the  difficulty  increases  by 
unit  amounts  from  sentence  to  sentence.  A  child's  score 
in  the  test  is  not  the  amount  done,  but  the  difficulty  of  the 
hardest  sentence  he  is  able  to  complete.  This  fact  is  some- 
what obscured  by  the  method  of  scoring,  which  allows  two 
points  for  each  perfect  answer  and  one  for  imperfect  but 
not incorrect answers; yet, in general, if a child completes
five sentences and receives a score of 10, the score
should be interpreted to mean that the child's development
in ability to complete sentences has reached 10
in  an  absolute  scale  ranging  from  zero  to  20.  Thus, 
the  test  serves  to  sort  children  into  groups  on  the  basis 
of  their  development.  In  the  eighth  grade  at  Gary, 
measured  with  Scale  B,  there  were  three  children  whose 
maximum  ability  was  6  units,  one  child  whose  maximum 
ability  was  7  units,  six  children  of  8  units  ability,  two 
children  of  9  units  ability,  seven  children  of  10  units 
ability, six of ability 11, fourteen of ability 12, twenty-one
of ability 13, twenty-six of ability 14, eighteen of ability
15,  eleven  of  ability  16,  three  of  ability  17,  and  one  of 


1 For the details in regard to the tests and the methods by which the
values of the different sentences were determined, the reader is referred
to Teachers College, Columbia University, Contributions to Education,
No. 77.

Figure 56
Results of Measurement with Trabue Language Scale

The sentences of different difficulty serve to sort the children on the
basis of ability. The relative difficulty of the various sentences composing
the tests is represented by the lengths of the row of rectangles. The
scale along the vertical axis shows the value of these lengths in terms of
Trabue units. The figures on the arrows show the number of the 119
eighth grade children tested who were able to complete each sentence,
but not the next higher sentence. That is, 3 children were able to complete
successfully the third sentence, but failed on the fourth and all
sentences thereafter.

The  graph  makes  clear  the  great  range  of  ability  found  in  the  eighth 
grade.    This  condition  is  not  peculiar  to  Gary,  however. 

ability 18 (Figure 56, page 336). That is, the 119 eighth
grade  children  tested  range  in  ability  from  Trabue's  sixth 
grade  standard  to  abilities  higher  than  the  twelfth  grade 
standard. 
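
The distribution just enumerated can be tallied in a short Python sketch to confirm the total of 119 children and to locate the centre of the group. The median and the most frequent ability are computed here for illustration; they are not figures quoted in the text.

```python
from statistics import median

# Number of eighth grade children at each maximum ability (Trabue units),
# Scale B, as enumerated in the text.
children_at_ability = {
    6: 3, 7: 1, 8: 6, 9: 2, 10: 7, 11: 6, 12: 14,
    13: 21, 14: 26, 15: 18, 16: 11, 17: 3, 18: 1,
}

total_children = sum(children_at_ability.values())

# Expand the tally into one entry per child to find the median ability.
abilities = [a for a, n in children_at_ability.items() for _ in range(n)]
median_ability = median(abilities)

# The ability reached by the largest number of children.
most_frequent_ability = max(children_at_ability, key=children_at_ability.get)

print(total_children, min(abilities), max(abilities),
      median_ability, most_frequent_ability)
```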

With  the  Trabue  Scales,  as  with  most  of  the  other 
tests,  the  individual  variation  within  a  grade  is  very 
large,  while  the  increase  in  score  from  grade  to  grade  is 
very  small  (average,  grades  three  to  nine,  14  points). 
A  year  of  school  work  produces  very  little  change  in 
the  group  ability,  yet  within  any  one  class  individual 
children  vary  almost  from  one  extreme  to  the  other. 
Therefore,  the  exact  interpretation  to  be  placed  upon 
results  from  Trabue's  Scale  is  a  matter  of  some  doubt. 
It  is  probable  that  they  should  be  considered  as  affording 
a  reliable  basis  for  comparing  the  general  intelligence 
or  intellectual  development  of  groups  of  children. 

At  Gary  Scales  B  and  C  were  given  and  scored  under  the 
standard  conditions  set  up  by  Trabue.  Scales  D  and  E 
were  also  given,  but  the  time  allowance  was  reduced  from 
seven  minutes  to  two  minutes.  It  was  thought  that  in 
this  time  a  few  of  the  brighter  children  would  finish, 
but  that  most  of  the  children  would  still  be  writing. 
That  is,  the  attempt  was  made  to  change  the  test  from  a 
difficulty  test  to  a  rate  test.  The  resulting  grade  scores 
were,  however,  but  very  slightly  reduced,  so  that  in 
effect  there  were  four  measurements  of  the  Gary  children 
with  the  Trabue  Tests.  An  inspection  of  Table  LXX, 
Section  I,  page  294,  of  this  chapter  will  show  that  the 
grade  scores  are  remarkably  constant.    That  is,  a  single 

TABLE LXXXV

Comparative Results of Different Trials of Trabue Language Scales1

D and E accuracy 45%
D and E rate 86%

1 Based on scores of 42 eighth grade pupils.

This table is to be read as follows: 55 per cent of the 42 eighth grade
children maintain the same relative position when tested with Scale B
as with Scale C, within one unit of variability (2 points of score). The
correspondence between scores from Scale B on a seven minute basis
and Scale D on a two minute basis was greater, being 62 per cent, but
between Scale B and Scale E was less, being but 38 per cent. The correspondence
between scores for number of points attempted in Scales D and
E was 86 per cent, but the correspondence for scores for accuracy of
work was but 45 per cent.

test by means of any one of the four scales yields a reliable
measure  of  the  group.  The  Gary  generalized  scores,  being 
based  upon  the  four  trials,  may  therefore  be  accepted  as 
accurately  reflecting  the  abilities  of  the  Gary  children. 

The  results  of  the  repeated  test  throw  light  upon  the 
question  of  the  reliability  of  the  individual  scores.  The 
coefficients of correspondence between the scores determined
from the various scales were found (Table
LXXXV, page 338). In general, approximately half the
children will, in successive tests, maintain the same relative
position within the group. When the tests are given
with a two minute time allowance, 86 per cent. of the
children will maintain the same position in two trials
as far as rate scores are concerned, but only 45
per cent. as far as accuracy scores are concerned. In
other  words,  the  high  correlations  which  exist  between 
the  conventional  Trabue  scores  and  the  other  general 
measures  of  ability  in  school  work  are  probably  due  to 
the  fact  that  maximum  scores  are  determined  more  by 
general  intelligence  than  by  specific  abilities. 

The  scoring  of  the  Trabue  Tests  presents  an  interesting 
and  difficult  problem.  The  relative  difficulty  of  the 
different  sentences  was  determined  on  the  basis  of  the 
responses  made  by  children  when  given  tests  of  about  56 
sentences  at  a  time.  In  the  form  used  at  Gary  each 
test  consists  of  but  ten  sentences.  It  is  extremely 
probable  that  the  relative  values  of  the  sentences  would 
be  affected  by  this  change,  but  how  much  is  not  known.1 

1 See page 296, footnote.

A second peculiar point is the question of whether or
not  a  sentence  has  the  same  value  to  one  who  cannot 
complete  it  at  all,  to  one  who  can  just  complete  it,  or  can 
complete  it  only  after  effort,  and  to  one  who  can  complete 
it  easily.  Under  the  standard  conditions  such  problems 
do  not  arise,  as  the  time  limit  is  from  three  to  four  times 
as  long  as  is  necessary  for  all  the  children  in  the  upper 
grades  to  try  all  the  problems  once.  But  under  the 
shortened  time  allowance,  the  question  of  the  amount 
of  credit  to  be  given  for  each  sentence  is  an  important 
one.  For  the  Gary  tabulations  the  scores  for  number 
of  sentences  tried  were  based  on  the  regular  allowance 
of  2  points  per  question.  The  conventional  score  was 
taken as points right and the accuracy score as the relation
between the points attempted and the points right.
If  the  sentences  are  considered  to  increase  in  difficulty, 
however,  it  would  probably  have  been  better  to  have 
computed  a  cumulative  score  for  attempts  and  rights; 
that  is,  allow  2  points  for  the  first  question  answered 
and  4  for  the  second  and  so  on.  The  sum  of  all  the 
points  attempted  would  then  be  the  rate  score  and  the 
per  cent,  the  points  right  were  of  the  points  attempted 
would  be  the  accuracy  score.1  However,  as  it  was 
desired  to  make  comparisons  between  the  scores  for 
Scales  C  and  D,  and  Scales  D  and  E,  the  conventional 
method  was  used  as  a  basis  of  scoring  in  both  cases. 
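
The difference between the conventional and the cumulative methods of scoring may be seen in the following sketch. It assumes, for simplicity, that each sentence attempted is either wholly right or wholly wrong (the one-point credit for imperfect answers is ignored) and that sentences are attempted in order. The function names and the sample record are hypothetical.

```python
def conventional_scoring(answers):
    """Two points per sentence attempted.

    `answers` lists, in order, whether each attempted sentence was right.
    Returns (points attempted, points right, accuracy per cent).
    """
    if not answers:
        return 0, 0, 0.0
    attempted = 2 * len(answers)
    right = 2 * sum(answers)
    return attempted, right, 100 * right / attempted


def cumulative_scoring(answers):
    """Two points for the first sentence, four for the second, and so on."""
    if not answers:
        return 0, 0, 0.0
    weights = [2 * (i + 1) for i in range(len(answers))]
    attempted = sum(weights)
    right = sum(w for w, ok in zip(weights, answers) if ok)
    return attempted, right, 100 * right / attempted


# Hypothetical child: six sentences attempted, the fourth and sixth wrong.
answers = [True, True, True, False, True, False]
print(conventional_scoring(answers))
print(cumulative_scoring(answers))
```

Under the cumulative method an error on a late, difficult sentence lowers the accuracy score more than an error on an early one, which is the effect the alternative is meant to capture.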


1 The coefficient of correspondence for 42 eighth grade scores in Scales
D and E figured on this basis was 43 per cent. for rate and 55 per cent.
for accuracy (one unit of variability).

A tabulation of the rate and accuracy scores of 42
eighth grade children in the four trials of the tests
was made to show the differences in the conditions under
which the children were tested for the different scales.
In Tests B and C about 80 per cent. of the children
wrote something for every question, but very much of
what was written beyond a certain point was incorrect.
The median accuracy of the group is about 70 per
cent. In Tests D and E, however, each child had time
to reach the sentences which were difficult for him, yet
not time enough to puzzle out answers that did not
readily suggest themselves. The median accuracy is
about 5 per cent. higher. Only 7 per cent. of the
children were able to finish all the sentences, and the
remaining  children  are  shown  in  positions  which  reflect 
their  relative  rates  of  work  as  well  as  the  difficulty  of 
the  sentences  they  are  able  to  complete  in  the  given  time 
(Table  LXXXVI,  pages  342-343). 

The  point  of  the  data  in  the  tables  is  that  any  considera- 
tion of  what  the  results  mean  must  take  into  account 
the  fact  that  the  conventional  Trabue  scores  are  scores  of 
maximum  achievement.  They  tell  nothing  about  the  rela- 
tive efficiency  with  which  the  results  have  been  achieved. 
If  the  results  seem  more  constant  than  in  other  tests,  it 
is  because  the  conditions  under  which  they  are  given  are 
so  controlled  as  deliberately  to  ignore  those  important 
phases  of  a  child's  work  which  differentiate  him  from 
other  children,  such  phases  as  result  in  differences  in  the 
rate  at  which  he  works  and  in  the  quality  of  his  output. 
Put in other words, the conventional Trabue scores
eliminate  very  large  differences  in  degrees  of  skill,  and 
differences  in  the  efficiency  with  which  the  skills  are  used. 
They  reveal  only  the  utmost  which  a  child  is  capable 
of  achieving,  without  regard  to  the  effort  by  which  it  is 
achieved.  Under  such  conditions  it  is  not  surprising 
that  the  results  correlate  highly  with  the  results  from  the 
Binet-Simon  Tests  and  other  tests  of  general  intelli- 
gence.1 But  to  a  corresponding  degree  they  are  not 
measures  of  the  effects  of  classroom  training. 

If  the  Kansas  Reading  Tests  were  given  under  condi- 
tions similar  to  those  required  by  Trabue,  very  different 
scores  would  result.  For  each  of  the  other  tests  used 
an  equivalent  statement  can  be  made.  It  is  important, 
therefore,  that  the  reader  recognize  the  difference  in  the 
conditions  and  make  no  comparisons  between  the  results 
of  the  Trabue  Scales  and  the  results  of  other  tests  without 
keeping  this  fact  in  mind. 

It  is  contended  by  many  persons  that  both  the  Kansas 
Silent  Reading  Tests  and  the  Trabue  Scales  should  not 
be  considered  either  measures  of  reading  ability  or  of 
ability  in  language  work,  but  measures  of  general  in- 
telligence. It  should  be  evident  from  the  discussions 
above  that  neither  of  these  tests  measures  directly  any 
single  product  of  classroom  teaching,  and  both  call 
for  the  exercise  of  much  initiative,  judgment,  and 
reasoning  ability  in  addition  to  the  ability  to  read 

1 See Teachers College, Columbia University, Contributions to Education,
No. 77, p. 77.

and understand the material of the tests. The reader
should note at this point that if the tests are considered
as measures of reading ability, they confirm the previous
results in showing that the Gary children do very nearly,
if not quite, as well in reading as the children in conventional
schools. If the tests are considered as measures
of general intelligence and reasoning ability (without
attempting to define what those terms may mean),
then they show that the Gary children are normal in
general capacity and intelligence.

RELATIONS  BETWEEN  TESTS 

Perhaps the most convincing proof of the fact that each
of the various reading tests used at Gary measures a
particular phase of reading ability is found in the relation
between the individual scores in various tests. The
coefficients of correspondence were computed from the
scores of $$ eighth grade children, all who were measured
in all of the reading tests (Table LXXXVII, pages
346 and 347). Teachers' marks refer to the marks (in
per cent.) assigned by the teachers on the estimate card
shown  in  Figure  53,  page  305.  Time  in  oral  reading  is 
the  number  of  seconds  taken  to  read  a  paragraph  of 
Gray's  Reading  Scale  (based  on  the  average  for  para- 
graphs four,  five,  and  six).  Points  in  oral  reading  rep- 
resent the  conventional  score  in  oral  reading.  Kansas 
scores,  rate  scores,  and  accuracy  scores  have  the 
meanings  previously  indicated,  but  were  determined  by 
averaging  the  two  out  of  three  variability  ratios  that 
were nearest alike. For instance, if a child's score was
+1.6 in one test, -2.7 in a second, and +1.8 in a third,
his  average  would  be  taken  as  +1.7,  the  low  score 
in  the  second  test  being  rejected  as  a  chance  varia- 
tion. This  procedure  was  adopted  after  comparing  its 
effects  with  results  obtained  from  averaging  the  actual 
scores.  The  ratios  were  deemed  a  better  basis  from  which 
to  work  because  of  the  differences  in  the  difficulty  of  the 
tests.  In  the  case  of  the  Trabue  Tests,  however,  the  actual 
scores  in  Scales  B  and  C  were  averaged  for  score  and  the 
actual  rates  and  accuracies  in  tests  with  Scales  D  and  E 
for  rate  and  accuracy  respectively.  The  rate  of  silent 
reading  is  based  upon  the  average  of  the  two  nearest  varia- 
bility ratios  out  of  the  three  in  the  reading  tests.  For 
the  reproduction  score,  ideas  reproduced  represent  the 
actual  number  of  ideas  reproduced,  while  accuracy  of 
reproduction  expresses  the  relation  between  the  actual 
points  and  the  possible  points.  Rate  of  reproduction 
refers  to  the  number  of  words  reproduced  per  minute. 
Finally,  average  position  was  found  by  averaging  the 
thirteen  variability  ratios  so  far  described. 
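
The rule of averaging the two out of three variability ratios that are nearest alike can be sketched as follows. The function name `nearest_pair_average` is hypothetical; the sample values are those of the instance given in the text.

```python
from itertools import combinations


def nearest_pair_average(ratios):
    """Average the two ratios that lie closest together and reject the third.

    If two pairs are equally close, the first pair encountered is used.
    """
    a, b = min(combinations(ratios, 2), key=lambda pair: abs(pair[0] - pair[1]))
    return (a + b) / 2


# The instance from the text: +1.6, -2.7, and +1.8 in three tests.
average = nearest_pair_average([1.6, -2.7, 1.8])
print(round(average, 2))
```

The low ratio of -2.7 is rejected and the remaining two are averaged, which gives the +1.7 of the text.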

The significance of this last measure may need comment. If a
child stands very high in all tests he has a high average
position. If he does well in some tests and poorly in others,
his average position is lower. It is probable that a relatively
constant position in all tests is an indication of his natural
capacity, so that average position may be regarded as a measure
expressing the general capacity of the individual so far as the
capacity is revealed by the reading test (Figure 57, page 350).

For most of the reading tests the degrees of correspondence
between any two tests are approximately constant. The
distribution of coefficients is as follows:


TABLE LXXXVIII

Distribution of Coefficients of Correspondence of Fourteen
Different Types of Scores from Reading Tests

Range of coefficient   20-29   30   40   50   60   70   80   TOTAL
Frequency                  5   14   27   20   16    8    1      91

The median coefficient is 50. That is, about half the children
will maintain the same relative position in any two sets of
scores from reading tests. A coefficient of from 40 to 60
means, therefore, only the degree of correspondence which is to
be expected from the fact that all scores are, in general,
predetermined by the major factors of heredity, maturity, and
training. Where, however, the coefficient of correspondence
falls to 30 or 20 per cent., it signifies that one or both of
the tests measure peculiar or specific abilities.1 When the
coefficient rises to 60, 70, or 80 per cent., it signifies that
scores in the last two tests are determined more nearly by the
same factors. It may be that the factor is similarity in the
abilities measured, or it may be that the factor is general
intelligence, but whatever causes a high score in one test
causes a high score in the other also.
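
The median of about 50 can be checked from the frequencies of Table LXXXVIII. The sketch below is a hedged illustration, not the survey's own computation: it assumes each class is ten points wide and begins at the value printed in the table, and it uses ordinary linear interpolation within the median class. It also checks that fourteen types of scores give 91 possible pairs, which agrees with the table's total.

```python
from math import comb

# Frequencies from Table LXXXVIII. Each class is assumed to be ten
# points wide and to start at the value printed in the table.
CLASS_WIDTH = 10
frequencies = {20: 5, 30: 14, 40: 27, 50: 20, 60: 16, 70: 8, 80: 1}


def grouped_median(freqs, width):
    """Interpolated median of a grouped frequency distribution."""
    total = sum(freqs.values())
    half = total / 2
    cumulative = 0
    for lower in sorted(freqs):
        count = freqs[lower]
        if count and cumulative + count >= half:
            return lower + (half - cumulative) / count * width
        cumulative += count
    raise ValueError("empty distribution")


total = sum(frequencies.values())
median_coefficient = grouped_median(frequencies, CLASS_WIDTH)
print(total, comb(14, 2), round(median_coefficient, 1))
```

Under these assumptions the interpolated median falls at the very top of the 40 class, which rounds to the 50 stated in the text.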


1 Journal of Applied Psychology, March, 1917, Vol. I, p. 26.

Figure 57
Individual Records in Reading Tests

The horizontal lines represent 14 different measures of a pupil's
ability in reading. The vertical line marked "O" represents class
median in each phase of reading ability. Distances to the left and
right of the class median represent positions above and below in
terms of the variability (median deviation). The solid line shows
the position in each of the 14 sets of results of the member of the
class who had the highest average position. The broken line
represents a similar record for the member of the class who had the
lowest average position. The dotted line represents the record of
the member of the class whose average position was exactly median.

The curve shows that while for individual tests there is a large
amount of individual variation, the position of each child shows a
tendency to vary about a certain general level of ability. It is
probable that this general level is determined more by capacity
than by training.

Figure 58
Average Position in All Reading Tests

The scale along the base of the figure represents the individuals
of a group of 33 eighth grade children. The solid line shows the
relative position of each child in the class as determined by the
average of his position in each of 13 sets of scores from Reading
Tests. The broken line shows the relative position of the same
children as determined by the scores in Gray's Oral Reading Scale.
Twenty-eight out of 33 children, or 85 per cent., maintain the same
position within one unit of variability. The dotted line shows the
relative position of the same children based upon accuracy scores
in the Trabue Test when given with a time allowance of but two
minutes. The percentage of correspondence between the Trabue scores
and the oral reading is 33 per cent.

The curves show that the scores in oral reading are probably
determined more by capacity than by training, while accuracy scores
in the Trabue Tests used as rate tests are measures of a specific
ability.
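
The percentage quoted for Figure 58 may be illustrated as follows. The sketch assumes that correspondence is counted as the share of children whose two positions differ by no more than one unit of variability, which is how the caption describes the 28 children. The function name and the four sample positions are hypothetical.

```python
def percent_within_one_unit(positions_a, positions_b, tolerance=1.0):
    """Per cent of pupils whose two positions differ by at most `tolerance`."""
    if len(positions_a) != len(positions_b) or not positions_a:
        raise ValueError("two equal-length, non-empty lists are required")
    agreeing = sum(
        1 for a, b in zip(positions_a, positions_b) if abs(a - b) <= tolerance
    )
    return 100 * agreeing / len(positions_a)


# Arithmetic check of the caption: 28 of 33 children.
caption_percent = round(100 * 28 / 33)
print(caption_percent)

# Hypothetical positions for four pupils, for illustration only.
average_position = [1.2, -0.4, 0.0, -1.9]
oral_reading = [0.6, -0.1, 1.5, -1.2]
print(percent_within_one_unit(average_position, oral_reading))
```

The first figure printed confirms that 28 of 33 is 85 per cent. to the nearest whole number. In the made-up sample three of the four pupils agree within one unit.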
The correspondence between teachers' marks and score
for  oral  reading  is  high.  The  correspondence  is  equally 
great  between  teachers'  marks  and  the  Kansas  Silent 
Reading  Tests,  or  between  teachers'  marks  and  the 
measure  of  average  position.  The  correspondence  with 
Trabue  is  also  high.  All  these,  however,  are  measures  of 
general  intelligence.  It  is  extremely  probable,  therefore, 
that  at  Gary  teachers'  marks  reflect  the  general  capacities 
of  the  children.  On  the  other  hand,  rate  of  silent  reading 
and accuracy of reproduction have very little correspondence
with the teachers' marks.

A corollary of the foregoing conclusions is that at Gary
scores in Gray's Oral Reading Tests are determined largely
by the capacities of the children. The coefficients between
the oral reading scores and the measures of general
intelligence are all high. The correspondence between
scores in oral reading and average position is 85 per cent.1
(Figure 58, page 351), a further confirmation of this
conclusion. The coefficients also show that time required to
read is a large factor in determining the oral reading score,
so this in turn must be determined largely by the native
capacities of the children.

1 Pearson's Coefficient of Correlation, +.73, P. E. ±.05.

The coefficients for the different types of scores for the
Kansas Silent Reading Tests and for the Trabue Tests are
interesting and significant. The conventional scores in these
tests show considerable correspondence with measures of
general ability but very little with specific abilities.
Accuracy, or degree of understanding, is a specific phase of
skill in reading, and like all other specific abilities shows
a low degree of correspondence with other abilities which are
equally specific.

The  scores  for  rate  of  silent  reading  show  considerable 
correspondence  with  the  time  scores,  and  with  scores 
for  average  position,  but  with  all  accuracy  scores  the 
correspondence  is  low.  Apparently  at  Gary  the  rapid 
readers  are  not  those  who  read  understandingly.  It 
is  probable  that  the  rate  of  reading  scores  and  all  the 
reproduction  scores  are  measures  of  general  abilities, 
but  here  again  the  general  correspondence  is  lower  for 
accuracy  of  reproduction  than  for  the  other  abilities. 

It  should  be  remembered  that  these  results  are  based 
upon  very  few  data  and  have  significance  only  for  Gary. 
Even  for  Gary  the  chief  point  to  the  foregoing  discussion 
is  that  the  scores  of  children  in  the  various  tests  vary  in 
every conceivable fashion. That is, the relation between
abilities or the dependence of one ability upon
another  varies  from  child  to  child.  Each  test,  so  far  as 
it  measures  a  specific  ability,  will  yield  scores  which  are 
significant  for  that  test  alone.  Therefore,  no  general 
comparisons  have  been  made.  Only  certain  phases  of 
reading  work  have  been  measured  and  conclusions  drawn 
are  to  be  interpreted  as  applying  to  these  particular 
phases  alone. 

CONCLUSION 

The foregoing discussion must have rendered evident the
truth of the statement previously made that the measurements
of reading ability at Gary are much less satisfactory than
for other subjects. But at least the conventional reading
tests have been given carefully, and as much is known about
the reading abilities of the Gary children as such tests
reveal.


VIII. FACTORS AFFECTING PERFORMANCE

Measurement of classroom products is ordinarily accomplished
by giving standard tests under controlled conditions. The
testing activity results in a series of figures (scores)
describing either quantitatively or qualitatively the way the
children behaved under the test conditions. The question
immediately arises: What relations do the scores made by the
children in standard tests bear to their real abilities? A
little reflection will show that the results of tests are
affected by many factors. For instance, nervousness may lower
a child's score for accuracy of work in addition from 100 to 0
per cent., while recent study or practice on a particular test
may lead to scores far above the normal level. Hence, the
score of a child in a test is a reliable measure of just one
thing, what he did in that test.
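
The distinction between a single performance and an ability inferred from several trials, which the footnoted definitions in this chapter draw, may be sketched as follows. The sketch takes the median of a series of trials as the estimate of ability and the median deviation as the measure of its reliability. Taking the median deviation as the median of the absolute deviations from the median is an assumption about the survey's exact computation, and the five scores are hypothetical.

```python
from statistics import median


def ability_estimate(trial_scores):
    """Return (median score, median deviation) for a series of trials.

    The median deviation is taken here as the median of the absolute
    deviations from the median, which is an assumption about the
    survey's exact computation.
    """
    center = median(trial_scores)
    spread = median(abs(score - center) for score in trial_scores)
    return center, spread


# Hypothetical scores of one child in five trials of the same test.
trials = [53, 61, 58, 40, 60]
print(ability_estimate(trials))
```

The one low trial of 40 barely moves the median, which is the practical reason for preferring the median of a series to any single score.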

HEREDITY 

The  basic  factor  in  the  performance  of  each  individual 
is  his  heredity,  or  capacity.1  That  children  differ  in 
capacity  is  the  common  experience  of  all,  and  these 

1 The three technical terms used in this discussion (performance,
ability, and capacity) may be defined as follows:

Performance  is  the  specific  achievement  (actual  score)  made  in  a 
particular  test.    Ability  is  the  general  power  to  perform.    It  is  best 
differences are observable from birth. So far as the
Gary  results  are  concerned,  the  effects  of  heredity  are 
either  negligible  or  unknown.  In  any  large  unselected 
group of children the percentages of individuals of exceptional
ability, of average ability, and of small ability
are  probably  constant.  If,  as  seems  probable,  unusually 
large  numbers  of  the  Gary  children  are  born  of  foreign 
parents,1  they  might  form  a  selected  group  as  to  capacity, 
provided  there  were  marked  racial  differences  in  capacity 
from  country  to  country.  However,  the  facts  in  regard 
to  all  such  hypotheses  are  wanting.  It  is  known  that 
the  children  in  the  Gary  schools  come  from  a  wide  variety 
of  racial  stocks.  Therefore,  as  a  group,  the  Gary  children 
probably  do  not  differ  greatly  in  their  basic  capacities 
from  the  average  of  children  in  other  cities. 

For example, the Gary eighth grade children copy figures
at the rate of 111 figures per minute as compared with the
score of 108 figures per minute for children in other cities
(Table LXXXIX, page 358, Figure 59, page 359). The activity
involved here is almost entirely the motor activity in
writing figures, and no direct training of this character is
given in the schools. The differences between the Gary and
the country-wide results in this
inferred  from  the  median  performance  in  a  series  of  trials  of  the  tests 
since  the  amount  of  variation  shown  by  the  series  as  a  whole  furnishes 
a  measure  of  the  reliability  of  the  inference.  Capacity  is  potential,  or 
undeveloped  ability,  the  possibilities  of  development  inherent  in  a  child's 
original  nature.  For  an  extended  discussion  of  these  definitions  see 
Bulletin  No.  4,  Courtis  Standard  Research  Tests. 

1 See The Gary Public Schools: A General Account.

test are not significant and are, if anything, in favor of
the Gary children. In the Kansas Silent Reading Test and in
the Trabue Test, which are considered by many as measures of
general intelligence, the differences between the Gary results
and those from other cities are small.1 In view of all the
facts that are available, the author considers it probable
that the Gary children represent a normal group as far as
inheritance of average mental capacity is concerned.2

MATURITY 

The second important general factor affecting performance
is maturity. By maturity is meant that increase in ability
from grade to grade caused not by additional purposive school
training, but by the greater development, increased vigor,
and riper experience due to added age.

For instance, in the first trial of the test of copying
figures, 53 figures per minute were written by the third grade
children. This proves that as a result of the training in the
early grades the ability to write figures is well developed by
the third grade. The grade scores change from 53 figures per
minute in the third grade to 120 figures per minute in the
twelfth grade. This change, however, is not due to increased
knowledge or to direct training, but solely to that increase
in ability which comes from increased maturity.

1 See Chapter VII, page 29a.

2 In Gary there is a formal organization of health, dental, and
psychological clinics, and several classes are composed wholly of
children whose mental condition is such that they are not the
equal of normal children. The scores of such classes are not
included in the tabulations of the preceding chapters. The Gary
results would be lower if the children from the special classes
had been distributed through the grades on the basis of their
ages.

Figure 59
Development in Copying Figures

[Graph, The Gary Survey, copying figures: curves labeled Standard,
First Trial, and Median of Five Trials, plotted against grades.]

The scale along the base of the figure represents grades. The
scale along the vertical axis at the left represents the number of
figures copied per minute. The heavy curve in the figure is based
upon the median scores of the various grades, taking into
consideration the records made by each individual in the five
separate trials of the test. The median scores based upon the
results of the first trial are shown by the broken line. The
standard scores based upon records of about 60,000 children
throughout the country are shown by the light dotted line.

Differences between the standard and the Gary scores are not
significant, but tend to show that on the first trial the Gary
children reacted to the test situation more slowly than the
children of the average school, but that on the basis of the
median of five trials the scores are higher than those of the
average school. They also show that development in this test at
Gary follows very closely the conventional rates, so that the Gary
children in respect to the abilities measured by this test are
probably a normal, that is, an unselected, group.

It  may  be  objected  that  the  ability  to  copy  figures  is  a 
direct  product  of  school  training.  This  is,  of  course, 
true  in  a  general  sense.  If  the  children  had  received  no 
training  whatever  in  copying  figures,  their  scores  would 
all  have  been  zero,  or  nearly  so.  However,  no  specific 
training  in  copying  figures  is  given  as  such.  Whatever 
the  activity  to  be  tested,  some  form  of  motor  response 
will  be  an  essential  part  of  the  total  response  and  the 
actual  performance  of  the  child  will  be  influenced  by  his 
whole  past  life.  This  is  not  what  we  ordinarily  mean  by 
school  training.  The  increases  discussed  above  represent, 
of  course,  the  increase  due  to  maturity,  pure  and  simple, 
plus  the  transfer  due  to  the  general  training  of  daily 
school  and  home  life.  This  does  not  mean  that  a  test  in 
copying  figures  can  be  used  to  measure  maturity,  merely 
that  maturity  is  a  factor  contributing  to  the  change  in 
score. 

Both at Gary and in the results based upon thousands of
children in many school systems, the third grade score is 58
per cent. of the eighth grade score, the fourth 69 per cent.,
and so on (Table XC, page 361). The correspondence between
the rate of increase of score at Gary and in school children
generally is almost perfect. That is, for both the Gary
children and children in general the rate of increase in this
test is determined by the rate of increase of maturity because
it is not determined by express school training directed to
that end.

TABLE XC

Grade                                          3    4    5    6    7    8

Gary                                          59   70   82   89   97  100
Gary based on 115 as the eighth grade score   57   68   79   86   94
Standard based on 66,837 individual scores    58   69   78   86   92

The scores of Table LXXXIX are here expressed as percentages of
the eighth grade score.

The table is to be read as follows: The Gary third grade score in
copying figures is 59 per cent. of the eighth grade score; the
fourth grade score is 70 per cent. of the eighth grade score, and
so on. In Table LXXXIX, page 358, the eighth grade score is lower
than it should be compared with the seventh and ninth grade
scores. The average of the seventh and ninth grade scores is 115
figures per minute.

The second line in the table above is to be read as follows: The
Gary third grade score in copying figures is 57 per cent. of 115,
the fourth grade 68 per cent., and so on. On this basis, the rate
of development at Gary is almost precisely that determined from
the scores of 66,837 children in schools in many states.
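
The re-expression of the Gary percentages on a base of 115 can be checked by simple proportion. The sketch below is illustrative only. The percentages are those read from Table XC for grades three through seven, and the Gary eighth grade score of 111 figures per minute is an assumption adopted here because it reproduces the table's second line. The name rebase is hypothetical.

```python
# Percentages read from Table XC (grades 3 through 7).
grades = [3, 4, 5, 6, 7]
gary_percent_of_own_eighth = [59, 70, 82, 89, 97]
standard_percent = [58, 69, 78, 86, 92]

GARY_EIGHTH_GRADE = 111      # figures per minute; assumed, see lead-in
ADJUSTED_EIGHTH_GRADE = 115  # average of seventh and ninth grade scores


def rebase(percentages, old_base, new_base):
    """Re-express percentages of `old_base` as percentages of `new_base`."""
    return [round(p * old_base / new_base) for p in percentages]


gary_on_115 = rebase(
    gary_percent_of_own_eighth, GARY_EIGHTH_GRADE, ADJUSTED_EIGHTH_GRADE
)
largest_gap = max(abs(g - s) for g, s in zip(gary_on_115, standard_percent))
print(gary_on_115, largest_gap)
```

On these assumptions the rebased Gary figures come out as 57, 68, 79, 86, and 94, and none differs from the standard by more than two points, which is the sense in which the two rates of development are said to be almost the same.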

If the results from the different tests at Gary are studied
from this point of view it will be found that the various
scores for rate of work may be divided into three groups or
classes (Table XCI, page 363). Class A includes three tests,
copying figures, oral reading, and multiplication tables, in
which the motor and mental activities involved have been quite
fully habituated earlier than the lowest grades in which the
testing work begins. Most children are able to do the thinking
called for in these tests much more readily than they can
write or speak, so that their rate of work is determined
almost wholly by the rate of motor activity. At Gary all three
sets of scores show the same rate of development. The inference
to be drawn from these data is that the instruction in oral
reading and in the multiplication tables has produced no
greater effects upon the rate of development of the abilities
of the Gary children in these tests than the generalized
training has upon the ability to copy figures. Ability develops
at a rate which is the same for all, and which is fully
accounted for by the increase in maturity or general training.

In class B are included the rate scores for five other
tests, the cancellation test, the rate of writing in the free
choice test, in the reproduction test, in the composition
test, the rate of adding in Series B, and the rate of
answering questions in the Kansas Silent Reading Test. In all
these activities the motor element is dependent upon, and
controlled by, the mental. A child cannot cancel triangles
faster than he can discriminate between the different forms;
he cannot write the answers to addition examples faster than
he can think the sum of the successive addends; he cannot
write the words in the composition test until he has
determined what words are to be written. The scores of the
tests in class B, therefore, represent activities of quite a
different type from those in class A.

The children at Gary could not have received any direct
training for the test in canceling triangles, as it was
comparatively a new test, not in general use, and not related
to conventional school work. Yet the rate of relative
development in the cancellation test is almost the same as the
rate of development in addition and in the various other tests
of the group. For all tests, the average third grade score is
39 per cent. of the eighth grade score, the fourth grade 52
per cent., the fifth grade 64 per cent., and so on. Moreover,
these percentages correspond (except in the lowest grades)
with the rate of development of the strength of grip of boys
and girls in the elementary grades (derived from Smedley's
measurement of 6,000 children in the Chicago schools). Under
the circumstances, it is plainly to be seen that rate scores
in the tests of class B are determined by maturity and general
training rather than by the direct effects of school work.

In this connection the reader is referred to Figure 25
which is reproduced here as Figure 60. The difference between
the two curves in the graph represents the difference between
the Gary product and the conventional product, or the
difference between incidental development and development
under formal training. In other words, there is no evidence
that the Gary children develop in ability any more rapidly
because of the training received in school than they would
develop if they left school at the fourth grade and were
subject merely to the general training of life. The increase
in score in addition would seem to be due almost wholly to
maturity and not to direct training. If this resulted in
adequate rate and accuracy of work, no greater commendation of
the Gary system could be given. But the levels of ability
developed are not adequate, and, in view of the attention
given to formal drill, the figures show merely the extent to
which the classroom training fails to function.1

Figure 60
Development in Speed and Accuracy: Addition

Comparison of development in addition at Gary with median
development in rate and accuracy in addition based upon a
tabulation of the results from tests of thousands of children in
cities of many different types.

The scale along the top of the figure represents rate, or the
number of examples attempted (speed). The scale along the left
hand side of the figure represents the ratio of examples right to
examples attempted, or the accuracy of work expressed in per cent.
Each point of the diagram, therefore, represents two scores, rate
and accuracy. The position of the circle marked "4" on the general
curve (broken line) represents a rate of 7.4 examples attempted
and 64 per cent. accuracy.

The curve for the Gary results is shown by the heavy line. The
circles indicate the position of the different grade scores. The
twelfth grade score in rate falls between the sixth and seventh
grade score on the general curve, and in accuracy is slightly
below the fifth grade level. The eighth grade Gary results are not
quite equal to the general fifth grade scores in rate, and very
much lower than the fourth grade in accuracy.

The position of the general curve below the fourth grade is not
very reliable, but the difference between the general character of
the Gary curve and that of the general curve is marked. The
general curve indicates that the development of skill in addition
is, in the conventional school, nearly completed by the end of
grade five, while the Gary curve shows that there is a very small,
but regular, increase in rate and accuracy from grade to grade up
to the end of the high school years. In high school years there is
no direct training for the development of skill in addition, so
the progress from grade to grade must represent either incidental
training or the effect of the elimination of the less able by
non-promotion. Therefore, the Gary curve as a whole would seem to
indicate that growth in skill in addition in all grades is due
mainly to the same causes, and very little to direct training.
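
The two scores plotted for each grade in Figure 60 are connected by a simple ratio, since accuracy is there defined as examples right divided by examples attempted. The sketch below is illustrative only. The function names are assumptions, and the pupil with 6 examples right out of 8 is hypothetical.

```python
def accuracy_percent(right, attempted):
    """Accuracy as the ratio of examples right to examples attempted."""
    if attempted <= 0:
        raise ValueError("at least one example must be attempted")
    return 100 * right / attempted


def implied_examples_right(rate, accuracy):
    """Examples right implied by a rate and an accuracy in per cent."""
    return rate * accuracy / 100


# The general fourth grade point: 7.4 attempted at 64 per cent accuracy.
print(round(implied_examples_right(7.4, 64), 1))

# A hypothetical pupil with 6 examples right out of 8 attempted.
print(accuracy_percent(6, 8))
```

Read in this way, the fourth grade point on the general curve stands for somewhat under five examples right out of 7.4 attempted.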

1 Perhaps it is well to point out that if any part of the low
scores at Gary were due to the care with which the tests were
given and scored, the effect of making the Gary scores comparable
with those from other cities would be merely to shift the position
of the Gary curve in the figure, not to change its character. In
the opinion of the author, Figure 60 and the tables of this
chapter are satisfactory evidence that the Gary scores reflect a
real condition, and not merely the effect of some unusual element
in the testing conditions.

Class C in Table XCI, page 363, includes the scores of a
number of other tests which have very different rates of
development. They are given to prove that the figures
previously quoted are not the results of chance. A child
cannot learn long division, for example, or how to multiply
one fraction by another by any ordinary form of incidental
training, and the rate of development for such activities is
very different from those given in the other two parts of the
table. For instance, in accuracy of work in the four
operations with fractions, the fourth grade accuracy is 58 per
cent. of the eighth grade development, but the increase in the
next two grades is very slight, and in the following two
grades the increase is very rapid (Table XCII, pages 370-1).
In other words, in those grades in which there was practically
no training, the accuracy scores are nearly stationary, but as
soon as training begins they develop rapidly.1

1 For the benefit of those who would like to check the conclusions
above by reference to data from other surveys, such data, arranged
in form similar to the Gary results, will be found in Appendix A,
page 397.

From the foregoing discussion it should be evident that in
making comparisons from city to city the maturity of the
children must be taken into consideration. If the children in
one city are much older for the grade than those of another
city, the rate scores made in any test in the first city would
be almost invariably higher than those in the second city,
even though the two were really equal in educational
efficiency. As has already been mentioned, in Gary tabulations
of the ages of children show that in the Froebel school in
some grades the children are, on the average, a year older
than the children of the corresponding grades in the other
schools of the city. It has also been pointed out that in
spite of this advantage of greater maturity, the actual
results in the Froebel school are in most tests lower than the
average of the city. The reason for this is that the gain due
to maturity is offset by certain handicaps (foreign parentage,
etc.). On the other hand, the grade scores of the Froebel
school in the test in copying figures are uniformly equal to,
or a little above, those of the city as a whole, although
individual classes may be found both above and below the city
standard.

The interpretation of the differences in score from one
city to another is, therefore, no simple matter, and the
relative effects of many forces must be considered. On the
average, of course, the age of children per grade will be
fairly constant from city to city, and the effect of maturity
negligible. For the Gary results the positive knowledge that
conditions in respect to over age are probably normal or
better makes it possible to say that the low scores shown in
preceding chapters are not to be attributed in any way to
maturity.

Similarly, in the interpretation of individual results no
set procedure can be followed. Other things being equal, the
older child will make the larger score, but physiological age
and chronological age are independent variables, and the
effects of differences of maturity may be entirely hidden by
differences in capacity, home training, etc. As a matter of
fact the forces are so many and their methods of operating so
obscure that no inferences may be safely made in the absence
of certain knowledge of all the conditions.
TRAINING

The third major factor which influences performance is
school training. When the test exercises are similar in every
way to the daily work of the classroom, performances in the
test may directly reflect school training. The list tests in
spelling are almost perfect examples of tests of this type.
Had the lists been based upon the words taught by the teacher
during the term, and had they been given in the regular course
of a day's work without special papers and without special
knowledge on the part of the children of the character of the
work being done, performance would have been even more largely
determined by school training. Finally, if several initial
tests of equal difficulty, based on the same words had been
given at the beginning of the term, so that each individual's
initial ability was known, a comparison of the final with the
initial performance would have revealed the effect of school
training,1 or the changes primarily produced by teaching
effort.
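
The comparison of final with initial performance described above may be sketched as a per-pupil gain. This is a hedged illustration of the idea only, not a procedure the survey carried out at Gary. The pupil labels, the scores, and the function name training_gains are all hypothetical.

```python
from statistics import median


def training_gains(initial, final):
    """Per-pupil change from initial to final score, keyed by pupil."""
    return {
        pupil: final[pupil] - initial[pupil]
        for pupil in initial
        if pupil in final
    }


# Hypothetical spelling scores (words right out of 20) for four pupils.
initial_scores = {"A": 8, "B": 12, "C": 15, "D": 5}
final_scores = {"A": 14, "B": 15, "C": 16, "D": 13}

gains = training_gains(initial_scores, final_scores)
print(gains, median(gains.values()))
```

The gains differ from pupil to pupil even in this small made-up class, so a summary such as the median gain describes the class and not any one child.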

The  educational  tests  given  at  Gary  did  not  have  for 
their  purpose  the  direct  determination  of  the  effects  of 
teaching  effort,  but  the  measurement  of  all  the  forces 


This  does  not  mean  at  all  that  changes  would  be  the  same  for  each 
pupil.  The  actual  response  an  individual  makes  to  training  is  also  de- 
termined by  heredity,  maturity,  past  training,  and  present  conditions. 
It  does  mean  that  school  training  would  be  the  major  force  acting  to 
produce  change  and  that  the  changes  which  took  place  would  be  a  meas- 
ure of  the  effects  of  this  force.  In  case  of  sickness  or  other  special  con- 
ditions, of  course,  the  statement  would  not  be  true  at  all. 
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acting,  of  which  school  training  is  one.  For  example,  if 
the  idea  of  comparing  the  efficiency  of,  say,  the  Boston 
and  the  Gary  schools  comes  to  mind,  what  is  meant  by 
such  a  comparison  is  the  equality  or  inequality  of  the 
final  product.  The  questioner  asks,  in  effect:  "Will 
the  Gary  children  at  any  grade  level  be  more  capable  of 
performing  school  tasks,  and  of  meeting  the  demands 
of  child  and  adult  life  outside  of  school,  than  Boston 
children  of  the  same  age  and  grade?"  He  does  not  ask 
the  narrower  and  more  difficult  question:  Does  the 
Gary  training,  hour  for  hour,  produce  greater  or  less 
effects  than  the  Boston  training?  He  assumes  that  if 
the training is more efficient, the resulting product will
be  correspondingly  greater. 

In  attempting  to  answer  such  a  question,  we  assume 
that  differences  in  the  effects  of  the  factors  of  heredity, 
maturity,  all  training  except  that  given  by  the  school, 
and  all  other  special  conditions,  are  negligible  and  that 
difference  in  school  training  is  the  one  factor  that  causes 
differences  in  results.  The  reader  will  readily  see  that 
this  means  that  all  the  effects  of  street  life,  home  life, 
and  all  other  forms  of  training  acting  to  produce  ability 
are  taken  as  being  identical  from  city  to  city  and  from 
state  to  state,  when  a  moment's  reflection  shows  that 
they  cannot  possibly  be  identical.  Yet  no  other  course  is 
possible  at  present.  Nevertheless,  the  fact  should  be  kept 
in  mind  that  the  products  measured  are  resultants  of  all 
the  training  factors  acting  and  not  the  narrower  and  more 
specific products, namely the changes produced by teaching effort
alone. Hence the care taken above to consider the
possible  effects  of  each  factor  that  might  have  operated 
to  produce  the  low  scores  at  Gary. 

YEARS  SPENT  IN  GARY 

In  this  connection  one  special  factor  needs  discussion, 
the  influence  that  the  number  of  years  of  training  received 
at  Gary  has  upon  the  results.  Gary  is  a  rapidly  growing 
city,  large  numbers  of  children  being  new  to  the  school 
each  year.  An  interesting  and  important  question  is: 
"What  have  the  tests  to  show  in  regard  to  the  abilities 
of  such  children  in  comparison  with  the  abilities  of 
children  who  have  received  part  or  all  of  their  training 
in  the  Gary  schools?" 

Three  types  of  ability  were  selected  for  this  study, 
accuracy  of  work  in  the  List  Spelling  Tests,  rate  and 
accuracy  in  the  arithmetic  test,  and  rate  and  accuracy 
in  the  Kansas  Reading  Test.  Tabulations  were  made 
separately  for  those  groups  of  children  in  the  seventh 
and  eighth  grades  who  had  been  in  the  Gary  classes  one, 
two,  three,  four,  five,  six,  seven,  eight,  and  nine  years. 
Unfortunately,  however,  records  of  entrance  into,  and 
progress through, the Gary schools were found to be imperfectly
kept. Consequently the data in regard to the
number  of  years  in  Gary  are  not  as  reliable  as  one  would 
like.  The  results  given  in  the  following  tables  are, 
however,  based  upon  the  official  records  of  the  school, 
and  not  upon  the  children's  statement  as  to  the  length 
of time spent in Gary; for from ten to twenty per
cent. of the cases no information whatever could be
secured.

Tabulations  were  made  in  spelling  and  in  the  Kansas 
Reading  Test  for  the  seventh  grade  as  representing  a 
larger  number  of  cases  than  the  eighth  grade.  For 
spelling  similar  tabulations  were  made  for  the  eighth 
grade.  Out  of  a  total  of  260  seventh  grade  children,  the 
number  of  individuals  who  have  lived  in  Gary  one,  two, 
three,  etc.,  years  is  fairly  constant  (up  to  six  years), 
averaging  30  to  each  group. 

The  results  admit  of  no  simple  interpretation.  The 
children  who  have  been  in  Gary  five  or  six  years  (score 
in  spelling,  five  years,  57.9  per  cent.;  six  years,  52.6  per 
cent.)  do  quite  or  nearly  as  well  as  those  who  have  entered 
during  the  year  1915-1916  (score  57.9  per  cent.)  (Tables 
XCIII,  page  377,  and  XCIV,  page  378).  On  the  other 
hand,  children  who  have  been  in  the  Gary  schools  two  or 
three  years,  or  seven,  eight,  or  nine  years,  more  often  have 
lower  scores  than  those  who  have  just  entered,  or  who 
have  been  in  attendance  for  five  or  six  years. 

Several  explanations  are  possible,  and  in  the  absence 
of  definite  information  no  certain  conclusions  can  be 
drawn.  On  the  face  of  the  results  it  would  seem  that 
for  two  or  three  years  following  entrance  into  the  Gary 
schools  a  period  of  readjustment  to  the  new  conditions 
follows  in  which  the  abilities  of  the  children  are  low. 
However, for those children who have spent the major
portion of their school lives in the Gary schools the results
are equal to the results obtained in the schools from
which the new children come. Finally, it would seem that
the scores made by the children who started in the Gary
schools when buildings were small and the system undeveloped
have a tendency to be lower than the scores made by
those who have had the benefit of recent improvements.

Table XCIV, Continued

This table is to be read as follows: Of 260 seventh grade children, 31
were reported as having been in the Gary schools one year or less. Of
these 31, two made 100 per cent accuracy in the List Spelling Test, four
made 95 per cent, three 90 per cent, and so on down to three who made
but 25 per cent. The average score for the group was 70.5 per cent.
The average score of the entire seventh grade was 61.9 per cent. X
represents a group of 26 children for whom no information in regard to
years spent in the Gary schools was available.

The results as a whole show no clear relation between children who
have spent five or six years in the Gary schools and those who have just
entered. Apparently the children who have received all their training
in the Gary schools do as well as those who have received the major
portion of their training elsewhere.

On  the  other  hand,  it  may  be  that  the  quality  of  the 
material  drawn  to  Gary  at  different  years  has  varied 
greatly;  that  during  certain  years  the  newcomers  have 
been more able, or better trained children, and that during
other periods the new material has been of lesser
ability.  For  a  complete  discussion  of  the  results  it  would 
be  necessary  to  determine  not  only  the  length  of  time  in 
Gary,  but  the  source  from  which  the  children  are  drawn. 
It  is  known  that  many  of  the  workers  in  the  steel  mills 
come  from  small  mill  towns  in  Pennsylvania,  and  from 
rural  communities  in  the  South  and  West.  In  other 
words, the writer interprets the data to mean that probably
the newcomers in Gary are drawn from small communities
and have had poor educational training.

A  study  of  the  frequency  with  which  each  type  of 
ability  occurs  in  the  different  grades  shows  that  the 
children  who  spell  with  perfect  accuracy  occur  in  every 
division  from  those  containing  the  scores  of  children  who 
have  just  arrived  in  Gary  to  the  scores  of  those  who  have 
spent  five  years  or  more  in  the  Gary  schools,  that  children 
with grades higher than 80 per cent. (standard 76 per
cent.) are found in considerable number in every group.
In similar fashion, in every group will be found children
who spell less than 30 per cent. of the words of the test
correctly.  Similar  conditions  could  be  shown  for 
other  tests.  In  other  words,  a  child  of  great  native 
capacity  who  has  spent  all  his  life  in  the  Gary  schools 
may  attain  a  perfect  score  in  the  spelling  tests  in  spite 
of  the  inefficiency  of  the  general  training. 

STATES 

The fourth group of major factors affecting performance
is best described as the physical, mental, or emotional
states of the children tested. Hunger, disease,
fatigue affect performance, as do also fear, nervousness,
interest, etc. However, there is no reason to think that
the conditions at Gary with respect to all such factors
differ from those in other cities except so far as the proportion
of foreign children is greater at Gary than in
other cities. In any city on any one day a few children
are likely to have headaches, or to be otherwise indisposed,
a few children's emotional equilibrium will be upset
by the idea of taking a test, and so on. But the
results show that when tests are properly given, the entire
number of scores which grossly misrepresent the abilities
of the children is not more than ten per cent. of the total.

PHYSIOLOGICAL  FACTORS 

Of  the  physiological  factors  affecting  performance, 
little  that  is  based  upon  positive  knowledge  can  be  said. 
There are few studies of the effects of left-handedness,
defective eyesight, sickness, accidents, and similar conditions
(which obviously must have an influence upon
the functioning of brain and muscle), so that it is impossible
to judge whether conditions in the Gary schools
are better or worse than in other school systems. In
general,  any  derangement  or  imperfection  of  the  human 
mechanism  will  affect  the  performance  of  that  individual 
in  a  test,  and  when  that  imperfection  is  a  temporary 
defect,  as  a  sick  headache,  or  a  sore  finger,  the  individual's 
performance  in  a  test  may  grossly  misrepresent  his  true 
ability.1  In  general,  also,  the  number  of  such  special 
conditions  should  be  fairly  constant  from  city  to  city  so 
that  the  effect  of  such  factors  at  Gary  should  not  be 
greater  than  in  other  cities. 

1 See page 452.

SUMMARY

In the foregoing pages the attempt has been made to
array the evidence which proves that a child's performance
in a standard test is determined by very many
factors,  of  which  ability  is  but  one.  Its  discussions  will 
have  been  in  vain  unless  the  reader  understands  that 
the meaning of it all is that the interpretation of the
results  of  testing  is  a  difficult  matter  at  best,  and  becomes 
quite  impossible  unless  the  nature  of  the  test  itself  and 
the  conditions  under  which  it  is  given  are  known.  The 
thoughtful  reader  will  appreciate  the  reason  for  the  many 
warnings  which  appear  throughout  this  report  against 
careless  comparisons  from  city  to  city. 

On  the  other  hand,  this  chapter  will  have  been  equally 
futile  if  it  succeeds  only  in  arousing  in  the  reader's  mind 
the suspicion that because of the ease with which variations
in performance may be brought about, standard
tests  are  unreliable  and,  therefore,  valueless.  It  must 
never  be  forgotten  that  educational  tests  are  as  reliable 
as  a  physician's  thermometer  and  no  more  so.  They 
register  with  absolute  definiteness  and  perfect  exactness 
the  precise  performance  of  each  child  under  the  given 
conditions.  The  writer  has  used  at  Gary  the  standard 
tests  which  seem  to  him  to  be  the  best  available  and  in 
the  foregoing  chapters  of  this  report  and  in  the  appendix 
has  arrayed  as  carefully  and  completely  as  possible  the 
conditions  under  which  they  were  given  and  scored. 
The  reader  should,  therefore,  be  able  to  determine  for 
himself  the  value  of  the  material  herein  presented,  and 
should  not  be  led  into  the  mistake  of  inferring  from  it 
either  too  little  or  too  much. 


IX. CONCLUSIONS

IN THE foregoing chapters the attempt has been made
to  present  in  a  strictly  impartial  manner  the  general 
as  well  as  the  detailed  results  obtained  from  the 
measurement  of  the  products  of  classroom  teaching  at 
Gary.  As  far  as  possible  inferences  and  conclusions  have 
been  omitted,  and  statements  limited  to  facts. 

The  person  in  charge  of  an  investigation,  however,  has 
superior opportunities to judge of the reliability of the results.
He sees the responses of children and teachers to
the  testing  situations  at  the  time  they  are  made.  Week 
after  week,  he  comes  in  contact  with  the  ordinary  routine 
of  the  school  life  under  conditions  less  formal  than  at  the 
times the tests are given. His observation of the workings
of many intangible factors behind the scenes aids and
influences him in interpreting the formal results.

Herein  lies  a  source  of  danger.  Human  observation 
at  best  is  faulty,  and  inferences  based  upon  it  may  be 
grossly  biased.  The  author  feels,  however,  that  in  spite 
of  this  danger  he  would  be  doing  less  than  his  full  duty 
if  he  did  not  give  his  personal  interpretation  of  the 
results secured. Accordingly in this chapter his conclusions
are recorded for the benefit of those who may desire
to  know  them. 

The general conclusion of the author is that the product
of classroom teaching of the fundamentals is, at
Gary, poor in quality and inadequate in amount; it
approximates in character the product of the poorer
conventional schools, and reveals in no particular the
slightest indication that it has been affected either
favorably  or  unfavorably  by  the  enriched  curriculum, 
or  other  special  features  of  the  Gary  schools.  The 
progress  from  grade  to  grade  is  relatively  small,  the  final 
levels  of  achievement  reached  are  comparatively  low, 
and  the  differences  between  the  results  of  simple  and 
complex  tests  in  any  one  subject  increase  progressively. 
The  entire  investigation  reveals  many  and  consistent 
evidences  of  careless  work,  imperfectly  developed  habits, 
and  marked  lack  of  achievement. 

The  reader  should,  however,  not  infer  too  much  from 
the  preceding  statement.  In  the  writer's  judgment  the 
results  do  not  mean  at  all  that  the  movement  for  the 
socialization  of  school  work  is  wrong,  that  the  new  type 
of  organization  is  injurious,  and  that  a  modernized 
program  is  a  failure.  They  do  mean  simply  and  solely 
that  none  of  those  features  of  the  Gary  experiment  which 
appeal  so  strongly  to  the  imagination  and  sympathies 
of  the  casual  visitor  have  operated  effectively  enough 
to  offset  the  inadequate  control  of  the  organization  and 
administration of the system which the other investigators
have brought to light.

In other words, results of tests reveal conditions but
do not show causes. While the results prove plainly
that the products of classroom teaching in the fundamentals
are poor, the reader should not allow his mind to
associate in a causal relationship the fundamental theories
of a modernized school and inefficient teaching.
Neither should he be misled by "explanations" of the results.
It would be most unscientific, with an adequate
explanation at hand, to assume that the effects noted
are due to the operation of a remote and questionable
cause. Present results prove merely that at Gary
certain vital factors in school work have been neglected,
but the reader should not decide what those factors are
until he has read the other volumes of the report.1

The  writer  does  not  believe  that  the  present  defects  of 
administration  are  in  the  slightest  degree  inherent  in  the 
attempt  to  enrich  the  school  curriculum  or  to  modify 
school  practice  so  that  it  will  appeal  to  the  interest  of 
the  child  and  satisfy  the  natural  instincts  of  his  developing 
life.  Therefore,  if  this  report  should  operate  to  retard 
the  progressive  movement  of  which  the  experiment  at 
Gary  is  an  expression,  he  would  feel  that  a  very  great 
and  needless  injury  had  been  done  American  education. 
Although  when  the  investigation  was  undertaken,  it  was 
expected  that  decisive  results  would  be  secured,  it  must 
now  be  emphasized  again  and  again  that  the  effects  of 
the  newer  ideals  of  education  have  not  been  measured, 
because  at  Gary  these  ideals  are  operating  under  such 
conditions  that  they  play  little  or  no  part  in  determining 
the  product  of  classroom  teaching. 

1 See report on Organization and Administration.

On the other hand, the benefit of this report may be
great  if  it  makes  clear  that  in  education,  as  in  other  walks 
of  life,  it  is  not  enough  to  take  for  granted  that  because 
aims  are  high,  intentions  good,  and  theory  apparently 
sound, satisfactory results are sure to follow. The courage
to attack experimentally great educational problems
cannot be too much commended, but experiments unmeasured
and unchecked, except by the subjective opinions
of their originators, are to be condemned.

After  all,  the  message  of  this  report  turns  out  to  be 
not  that  the  Gary  schools  are  good,  bad  or  indifferent 
but  that  by  measurement,  properly  used,  a  superintendent 
who  is  concentrating  his  attention  upon  certain  features 
of  a  constructive  experiment  may  determine  whether 
or  not  he  has  left  unnoticed  certain  other  features  of 
equal or greater importance.

Table II, Continued

The  table  is  to  be  read  as  follows: 

On  March  23,  Thursday,  a  general  test  was  given,  known  as  a  test  of 
rate  of  copying  figures.  This  was  the  first  trial  of  the  test  and  grades  2 
to 12 were tested. It proved impossible for the examiners to reach all the
classes in their academic classrooms on this day, however, so the few remaining
classes were visited the next day.

The  table  shows  that  12  different  kinds  of  tests  were  given;  that  some 
of  the  tests  were  given  more  than  once;  that  on  the  average  one  class  in 
the  upper  elementary  grades  was  tested  21  times  in  11  weeks  of  school 
work;  that  in  general  most  of  the  testing  was  done  on  Tuesdays  and 
Thursdays;  that  on  5  days  it  was  necessary  to  visit  certain  classes  on 
Wednesday  instead  of  on  Tuesday,  and  on  Friday  instead  of  on  Thursday; 
that  on  2  occasions  only  were  the  tests  given  mainly  on  any  other  days 
than  Tuesdays  and  Thursdays. 

The  disturbance  of  class  work  by  the  giving  of  a  test  varied  from  10 
minutes to 40 minutes. Taking 25 minutes as the average time (probably
an over-estimate), and 3 hours of school work per day as the total
time given to academic studies, the interruptions to be charged against
the testing work decreased the regular class time by approximately 6
per cent. during the testing period of 11 weeks, or a little less than 2
per cent. for the full year. This was probably much more than offset
by  the  stimulus  of  the  testing  work  to  both  teachers  and  pupils. 
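
The arithmetic behind these percentages can be sketched as follows. The figures of 21 tests, 25 minutes per test, 3 hours of academic work per day, and 11 weeks are taken from the text; the five-day school week and the 40-week school year are assumptions made only for illustration, since the report does not state them here, and the function name is hypothetical.

```python
def share_of_class_time(tests, minutes_per_test, hours_per_day, weeks, days_per_week=5):
    """Return the fraction of academic class time taken up by testing."""
    testing_minutes = tests * minutes_per_test
    class_minutes = weeks * days_per_week * hours_per_day * 60
    return testing_minutes / class_minutes


# Figures from the text: 21 tests of about 25 minutes, 3 academic hours a day.
during_testing = share_of_class_time(21, 25, 3, weeks=11)

# Assumed 40-week school year (hypothetical; not stated in the report).
full_year = share_of_class_time(21, 25, 3, weeks=40)

print(f"testing period: {during_testing:.1%}")
print(f"full year: {full_year:.1%}")
```

Under these assumptions the share comes to a little over 5 per cent. for the testing period and about 1.5 per cent. for the year, which is of the same order as the rounded figures given above.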
III. School to School Comparison for Handwriting

The tables are to be read as follows: In the free
choice test in handwriting, the lowest class measured
in the Froebel school was class No. 32. This was ranked
as of the 4C grade in June, 1916.

The generalized city wide score in the free choice
handwriting test for the 4C grade was 37 letters per
minute with a quality of 31 Ayres. The median rate
of writing of class No. 32 Froebel was 43.8 letters per
minute with a quality of 26.3 Ayres. That is, the rate
of writing of the class was 6.8 letters per minute higher
than the city wide score and 4.7 points lower in quality.
As both of these differences exceed .1 of the city wide
scores they are starred (*) in the table.

In Froebel four classes had scores which were markedly
above the city wide scores in rate, and four classes had
scores which were markedly below the city wide scores.
The  variation  in  the  remaining  classes  was  considered 
to  be  negligible. 
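
The starring rule described above can be sketched as a short check. The function name is hypothetical, and the reading of ".1 of the city wide scores" as one tenth of the corresponding city wide figure is an assumption made for illustration; the scores are those quoted in the text for class No. 32.

```python
def is_starred(class_score, city_score, tolerance=0.1):
    """Return True when the class differs from the city wide score
    by more than `tolerance` (one tenth) of the city wide score."""
    return abs(class_score - city_score) > tolerance * city_score


# Class No. 32, Froebel school, grade 4C (figures quoted in the text).
rate_diff = round(43.8 - 37, 1)      # letters per minute above the city wide rate
quality_diff = round(31 - 26.3, 1)   # Ayres points below the city wide quality

print(rate_diff, is_starred(43.8, 37))
print(quality_diff, is_starred(26.3, 31))
```

On this reading the thresholds are 3.7 letters per minute for rate and 3.1 Ayres points for quality, so both differences for this class are starred, in agreement with the text.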
V. Sample Gary Compositions Arranged as a Scale

The samples below constitute a composition scale
and are given that the reader may see the nature
of such a scale when all samples are of one general
type. In the Hillegas Scale the samples vary greatly
in style and content and the effect of this added
complexity is to confuse the judgment of those who
have no conception of the general scheme of development
in composition. On the other hand, the exact value
of the samples in the Hillegas Scale has been determined
with great care, so that it is much better to base judgment
directly upon the Hillegas Scale, provided the person
who does the scoring uses the scale only as a means
of deciding the exact value a given composition should
receive.

Papers which range in value from 0 to 30 units have
a common characteristic: they are difficult to read and
understand.  The  errors  in  spelling  and  syntax  are  so 
many  and  so  gross  that  on  first  reading  the  words  "do 
not  make  sense."  One  must  read  and  reread  many  times 
before  the  meaning  is  apparent.  Sometimes  even  then 
one  is  not  sure  what  the  writer  of  the  composition  really 
intended  to  say. 

The exact value to be assigned such samples will be
determined by the degree to which the meaning can be
made  out.  If  no  meaning  is  apparent  after  study,  the 
paper  would,  of  course,  have  zero  value,  but  such  samples 
will  be  found  but  rarely.  Sample  A  is  an  illustration  of 
a  composition  in  which  the  errors  are  so  many  and  so 
gross  that  little  meaning  is  apparent. 

SAMPLE  A 
AN  EXCITING  EXPERIENCE 

I  was  on  a  wonderful  to  no w- Yeark  and  it  waos 
and  the  journey  that  I  lecekt  so  mouch  butit 
the  fun  I  head  on  the  trean  but  the  buge  ried  in 
ealmer  Nou-Yeaork  waos  the  beast  thean.  and 
I  belavef  is  all  that  I  can  thank  on  so  godboy. 

Here  and  there  it  is  possible  to  get  a  glimpse  of  the 
thought  the  child  was  trying  to  express,  and  there  is  a 
certain persistence of general subject throughout the composition
as a whole. Its value, therefore, will be somewhere
between 0 and 9. The value assigned by the Gary
judges was 5 (average deviation of two judgments 5).

Sample  B  represents  a  higher  stage  of  development. 

SAMPLE  B 
THE  BEAER  AND  A  BOY 

Ones  their  wasa  littil  boy.    He  we»  went  in  the 
weth  a  gan  and  a  hachet  and  he  he  saw  a  beaer. 
The  beaer  begain  to  run  after  the  boy.  And
their  was  a  oke  tree  he  stared  to  clamb  the  tree. 
The  beaer  avas  after  the  boy  and  the  beaer 
stared  to  clamb  the  tree.  And  the  beaer  dam- 
bed  and  clambed  so  the  boy  thart  he  beter  get 
out  of  their  he  fawed  a  hoi  in  the  so  the  got  down 
in  the  hoi  and  the  beaer  con't  find  the  boy.  So 
the  beaer  went  on  an  lime.  The  bay  clambed 
out  and  chot  the  lime  of  and  the  bearer  wos 
celed. 

There  is  continuity  of  thought,  and  although  the  errors 
are  so  many  and  so  gross  that  it  is  necessary  to  read  and 
reread  the  words  before  one  can  translate  them  into  their 
conventional  form,  still  it  is  possible  to  make  out  the 
general  meaning.  Such  a  composition  will  range  in 
value from 10 to 19. The value assigned this composition
by the Gary judges was 15 (average deviation of
two  judgments  5). 

Sample C represents a still higher stage of development.

SAMPLE C

Once  My  nother  want  to  voiter  a  frincd  And 
she  take  me  a  gave  me  to  my  aunt  thats  ny 
mother  want  away  And  my  aunt  was  choping 
wood  Andl  was  playing  biut  a  bucter  Came  and 
ny  aent  want  inside  The  hoube  and  I  want  and 
take  The  hatck  and  was  choping  the  wood 
with  the  hatchek  and  chop  my  Finger  and  my
nother  wasnot  Home  so  I  uat  crying  and  Than 
ny  mother  came  after  There  day  and  I  was  over 
ny  aunt  House  I  was  in  bed  and  ny  mother  was 
glad  I  was  Well  and  I  was  happy  again.  And  than 
I  have  a  mark  on  My  frienger  it  is  the  left  hand 
Frienger  it  is  the  2ndfrineger  and  the  Than 
after  my  mother  take  me  Home  and  I  was  a 
happy  girl  After  But  I  was  till  or  ny  frienger 
That  marke  stayed  ny  mother  Was  helfing  my 
friend  and  Now  I  am  happy  girl  as  Happy  as 
can  be  happy. 

There  are  enough  words  correctly  spelled  and  correctly 
arranged  in  the  conventional  sequence  so  that  one  can 
get,  even  on  a  single  reading,  the  general  drift  of  the 
meaning.  However,  even  in  this  paper  one  must  read 
and  reread  parts  of  it  before  he  is  sure  what  thought  the 
writer  intended  to  express.  Papers  of  this  character 
range  in  value  from  20  to  29.  The  value  assigned  this 
paper  by  the  Gary  judges  was  20  (average  deviation  of 
two judgments 0).

In  passing  from  samples  of  the  first  type  to  samples 
of  the  second  type  the  emphasis  shifts  from  gross  errors 
in  spelling  and  structure  to  errors  in  organization.  The 
general  characteristic  of  samples  of  this  type  is  that  they 
are  tiresome  to  read. 

The  worst  sample  of  the  second  type  consists  of  mere 
succession of sentences loosely joined. Sample D is an
illustration and the value of such composition ranges
from  30  to  39  Hillegas. 

SAMPLE  D 
IN  THE  MOUNTAINS 

As  we  went  to  spent  our  vacation  I  happen  to 
be  right  near  the  mountains  I  was  glad  couse  I 
could  go  and  climb  just  as  higch  as  I  want  for. 

So  I  went  with  my  father  and  mother  we  went 
pvery  hiegh  it  was  getting  cold  already  why  I 
think  abouve  the  clouds  I  want  to  rich  the  tops 
but  couldn  couse  there  was  ice  and  it  was  so 
sleapry  to  goe  any  further  se  we  came  baak  when 
we  came  down  there  was  many  more  moun- 
tains and  I  disided  to  go  on  some  others  well  & 
ni  went  it  wasnot  very  hiegh  just  like  others 
so  when  nex  were  clinbing  it  little  to  sandy  we 
riched  the  top  alright  but  whenwe  wanted  to 
come  down  why  we  couldn  mole  so  we  sat  down 
and  slide  down  in  that  way  we  couldn  get  down 
in  that  city  where  we  were  its  too  cold  in  sum- 
mer sometimes  its  snowing  but  this  little  city 
was  full  of  trees  and  mountains. 

Note the frequent use of the word "and" joining disconnected
thoughts, also the amount of irrelevant material.
The value of the sample is determined by the
general effect produced by disorganization and the
mechanical mistakes. The value assigned this composition
by the Gary judges was 30 (median deviation
of five judgments 2, average deviation 3.6).

In  Sample  E  the  gross  errors  are  much  less  in  number 
and  less  serious  in  character.  On  the  other  hand,  the 
amount  of  irrelevant  material  is  large. 

SAMPLE E
AN  EXCITING  EXPERIENCE 

One  day  it  was  very  hot  and  we  didn't  know 
what  do  do. 

This  was  at  our  Gym.  period,  so  we  went 
over  on  the  lawn  and  sat  under  a  tree  for  a  while 
in  the  shade. 

After  a  while  one  of  the  girls  said,  Let's  play 
something. "  We  all  suggested  that  we  would 
play  ghost.  We  started  and  played  for  a  long 
while  and  then  we  got  tired  and  we  sugested  we 
would  play  something  else  so  we  played  leap 
frog. 

We  were  awfully  hot  now  so  we  sat  down  in 
the  shade  and  rested  ourselves. 

After  a  while  one  of  the  girls  sugested  that  we 
would  play  statue.  We  had  been  playing  awhile 
and  it  was  my  turn  to  be  be  swung  around. 
The  girl  that  was  swinging  had  swung  all  the 
other  girls  and  they  were  pretty  heavy  then  she 
took  me  and  swung  me  around  fast  not  thinking 
of  how  light  I  was  she  let  me  go  and  I  fell  on  my
left  wrist.  I  heard  it  crack  and  I  thought  it  was 
broke,  they  took  me  down  in  the  Gym  and  the 
Gym  teacher  bandaged  it  up.  It  was  not 
broke,  but  it  was  sprained. 

I  bet  there  were  fifty  girls  that  asked  me 
where  I  fell  what  I  was  doing  an  how  it  was 
done.  Well  that  was  the  last  game  that  I  have 
played  since  then. 

The value assigned this composition by the Gary judges
was 40 (median deviation of five judgments 0, average
deviation 1).

Sample  F  represents  a  still  higher  degree  of  ability. 

SAMPLE  F 
AN  ACCIDENT 

We  were  out  at  camp  No  133  which  is  sittua- 
ted  in  on  near  the  banks  of  Deep  River.  One  of 
the  men  that  stayed  at  this  camp  owned  a  old 
duck  boat  which  leaked  and  if  you  wanted  to 
ride in it you would have to set a certain way or
it  would  fill  with  water  and  soon  sink. 

My  brother  saw  me  paddaling  around  in  it 
and he decided that he would do it himself.

He  weighed  about  twenty-five  lbs.  more  than 
me  I  told  him  the  way  to  set  in  it  but  he  would 
not listen but said that one end was as good as
the  other. 

He  jumped  in  and  sat  down  on  the  nearest  end 
which  was  the  wrong  end  and  paddaled  out  in- 
to the  river.  He  paddaled  down  the  river  for 
some  distance  and  then  turned  around  to  come 
back.  By  this  time  the  boat  was  nearly  sink- 
ing and  we  saw  him  paddeling  as  fast  as  he  could 
go  to  get  back  to  the  bank. 

But  it  was  of  no  use  the  boat  began  to  sink 
and  he  tried  to  get  to  the  right  end  but  in  trying 
to  get  to  the  right  end  he  upset  the  boat  and  had 
to  swim  with  all  of  his  clothes  on.  The  water 
wasn't  very  cold  and  he  swam  all  the  way  to  up 
the  bridge  pushing  the  boat  with  him.  He  soon 
was in dry clothes and was none the worse for
the  accident. 

The  composition  deals  with  a  single  incident  and  there 
are  evidences  of  organization  and  intelligent  selection  of 
material.  On  the  other  hand,  the  errors  in  spelling  and 
structure  are  sufficient  in  number  and  character  to  mar 
the  continuity  of  thought,  and  the  composition  as  a 
whole,  like  others  of  this  type,  is  tiresome  to  read.  The 
value assigned this composition by the Gary judges was
50 (median deviation of five judgments 0, average
deviation .2).

Sample G represents the best sample of this type.

SAMPLE G
ON  THE  WATER 

While  at  a  small  lake  not  far  from  Gary  my 
small  brother  had  quite  an  exciting  adventure. 
The  lake  is  quite  deep  and  though  my  brother 
can  row  a  boat  quite  well  we  never  allowed  him 
to go out in a boat by himself.

One  day  my  brother  asked  mother  if  he  might 
go  down  to  the  beach.  She  replied,  "Yes,  but 
do  not  get  into  the  boats."  Clouds  began  to 
gather  and  it  looked  as  though  we  would  have  a 
violent  electric  storm. 

It  began  to  get  dark  and  mother  thought  of 
Bud  so  she  sent  my  sister  to  get  him.  After  a 
few  minutes  she  came  back  saying  that  Bud  was 
not  on  the  beach. 

We  asked  if  any  one  had  seen  him  but  no  one 
had.  I  thought  that  he  might  be  playing  with 
one  of  the  little  boys  so  I  went  to  a  house  on  the 
bluff  expecting  to  find  him  there. 

While  I  was  on  the  porch,  which  over  looks 
the  lake,  it  began  to  thunder  &  lighten  &  soon 
the  rain  came  down  in  torrents.  I  stood  look- 
ing out  over  the  lake  &  and  I  noticed  a  boat 
sway  at  the  other  side  of  the  lake. 

I  ran  to  my  father  and  told  him  what  I  had 
seen.    He  hurried  to  the  beach  &  found  that 
one of the boats was missing. Then he and
my  sister  took  a  boat  and  in  about  twenty  min- 
utes they  came  back  with  a  rather  frightened 
&  very  much  bedraggled  little  boy. 

He,  my  brother,  showed  great  presence  of 
mind,  for  when  the  storm  began  he  was  not 
frightened  very  much  and  tried  to  reach  the 
shore. 

You  may  be  sure  that  he  did  not  dare  to  go 
out  in  a  boat  alone  after  that.  He  told  us 
that  he  would  learn  to  swim  first. 

In  content,  in  organization,  and  in  choice  of  words,  this 
composition  gives  evidence  of  considerable  power.  The 
errors,  however,  are  enough  to  spoil  the  effect  of  the 
composition  as  a  whole,  and  to  read  many  such  papers 
would  be  a  tiresome  task.  The  value  assigned  this 
composition  by  the  Gary  judges  was  60  (median  devia- 
tion of  five  judgments  8,  average  deviation  6.6). 

In  the  samples  in  the  third  division  the  literary  appeal 
of  the  composition  is  its  chief  characteristic.  Errors  in 
mechanics  and  in  organization  there  may  be,  but  the 
choice  of  subject  matter  and  the  selection  of  words  to 
express  the  writer's  thoughts  are  so  skillful  that  the 
appeal  of  the  story  is  great  enough  to  hold  the  reader's 
interest  and  attention  in  spite  of  such  defects. 

Sample H, for instance, was given a value of 67 by the
Gary judges (median deviation of five judgments x,
average deviation 4).

SAMPLE H
MY  FIRST  MORNING  IN  MEXICO 

We  came  on  a  train  late  at  night  and  had  a 
hard  time  finding  an  American  hotel  in  San 
Louis  Potest  After  finding  one  we  went 
straight  to  bed,  although  my  mother  took  plenty 
of  time  to  lock  and  prop  the  door  closed.  The 
next  morning  I  woke  up  early  and  looked  out 
of  the  window.  The  first  thing  I  saw  was  a 
rickity  old  closed  up  wagon  coming  down  the 
street  drawn  by  a  few  very  small  borros.  This 
strange  looking  object  was  the  morning  street- 
car, the  driver  was  standing  in  front  blowing 
a  small  tin  horn  for  the  people  to  get  out  of  the 
way. 

Some  of  the  Peons  were  getting  breakfast,  the 
mother  sat  on  the  street  baking  tortellias  and 
the  family  seated  in  a  circle  about  her  eating  as 
fast  as  she  could  bake.  The  funniest  thing 
about  the  people  eating  was  that  the  pigs  and 
dogs  ran  about  the  outside  of  the  circle  eating 
the  scraps  that  were  thrown  them. 

Down  the  street  comes  a  man  riding  on  so 
small  a  borro  that  his  feet  touch  the  ground,  he 
is  smocking  a  cigerette  and  lazily  looking  about. 
Behind  him  walks  his  wife  holding  the  baby  and 
hiting  the  borro  her  husband  rides  on  to  make  it 


go.    Bciuh?  23k-  lent  .  imr-  lie 


street  aitsaot  a  nzhnc  bL^k  wjiiwiw^ 
bars  on  them  warca;  musr  lie  Jims  jndk  urns 


like  a  |  wise  Mi  Osc  ^ie  uixe  nf  3t  nnnnTngg 
can  be  sees  the  Hmmr«im.  ■.■»■' -i  jgqg  unfiling 
bat  the  iemi:i  sane  iil 


s  literary  merit  fies  it  lie  Barnes  if  is 
le  vividuesb   of  its    «"***'  \»'  *■■*    riiLil   tie 
spelling  and  uimsuie  exc  nat  y  imff  rurmgh  to 


Sample I represents the best composition found among
the papers written in the composition test at
Gary.

SAMPLE I

With  a  jar  and  a  somewhat  bosness 
jolt  the  riddy  elevator  came  to  a  stop  on  the 
basement  floor.    The  door  swung  open  and  I 
stepped  out  into  press-room  of  the  Chicago 
Tribune. 

Surrounded  by  a  mass  of  quiwering  steel  I 
was at a loss to know what to do. I was suddenly
confronted by a bearded man clothed in
ink smeared overall and jumper. A small tight
shop cap was set jauntily on oneside of his head
and  a  pair  of  steel  grey  eyes  peered  at  me  thru  a 
rather  large  pair  silver  rased  rimmed  glasses. 
He  seemed  to  be  saying  something  to  me  but 
the  battery  of  Hoe  presses  had  eew  control  of 
the  field  and  it  was  only  with  the  greatest  diffi- 
culty that  I  could  hear  what  he  was  trying  to 
tell  me. 

Beckoning  me  with  an  ink  stained  finger  the 
pressman,  for  such  was  the  position  of  this  man, 
piloted  me  around,  under,  and  even  over  masses 
of  quivering  and  roaring  steel  until  we  stopped 
before  a  press  which  stood  two  and  one  half 
stories  high,  a  half  city  block  in  length  and  the 
same  in  width.  The  silent  pressman  paused  for 
a  moment  to  glace  glance  with  pride  at  the  roar- 
ing monsters  when  he  motioned  me  again  and 
mounting  an  iron  stairway  with  a  brass  railing 
we  wore  soon  standing  on  the  top  deck  of  this 
master  press.  Two  and  a  half  stories  below  me 
sixteen  large  rolls  of  spottlessly  white  paper  were 
swiftly  unrolling  into  the  press.  Wheels  within 
wheels  whirled  and  sang  and  its  very  song 
seemed  to  say  "I  am  the  Frank  A  Munsey  the 
worlds  largest  newspaper  press." 


The  author  of  this  composition  was  a  twelfth  grade 
student  whose  specialty  lies  in  the  field  of  journalism. 
He is a reporter for the school papers, and as the Gary
representative  of  some  of  the  Chicago  dailies  had  had 
much  experience  in  writing.  The  value  assigned  the 
composition  by  the  Gary  judges  was  80.  (Median  de- 
viation of  three  judgments  5,  average  deviation  5.) 


VI.  Compositions  Valued  at  Approximately  50 
Hillegas  in  Different  Surveys 

Reproduced  here  that  the  reader  may  judge  of  the  uni- 
formity of  the  standard  of  different  examiners  using  the 
Hillegas  scale. 

PART OF SAMPLE OF COMPOSITION FROM HILLEGAS SCALE

VALUE 47.4

First De Quincys mother was a beautiful woman and
through her DeQuincy inhereted much of his genius.

His running away from school enfluenced him much
as he roamed through the woods, valleys and his
mind became very meditative.

The greatest enfluence of De Quincy's life was the
opium habit, If it was not for this habit it is doubtful
whether we would now be reading his writings.


SAMPLE OF COMPOSITION FROM TRABUE MODIFICATION OF
HILLEGAS SCALE, VALUE 49.7

Next Saturday I should like to go away and have a
good time on a farm. I should like to watch the
men plowing the fields and planting the corn, wheat,
and oats and other things planted on farms.

Next Saturday I will go to the Pioneer meeting if
nothing happens so that I cannot go. I should like
to go swimming but it is not warm enough and I
would catch a bad cold. I should like to go to my
aunts and drive the horses. I do not drive without some
older person with me, so I cannot go very often.

I should like to see my aunts cat and her kittens
too. I think I can, to.


PART OF SAMPLE OF COMPOSITION FROM GARY
(EIGHTH GRADE), VALUE 50


We  were  out  at  camp  No 
133  which  is  sittuated  in  an 
near  the  banks  of  Deep 
River.  One  of  the  men 
that  stayed  at  this  camp 
owned  a  old  duck  boat 
which  leaked  and  if  you 
wanted  to  ride  in  it  you 
would  have  to  set  a  certain 
way  or  it  would  fill  with 
water  and  soon  sink. 


My  brother  saw  me  pad- 
daling  around  in  it  and  he 
decided  that  he  would  do  it 
himself.  He  weighed  about 
twenty-five  lbs.  more  than 
me.  I  told  him  the  way 
to  set  in  it  but  he  would 
not  listen  but  said  that  one 
end  was  as  good  as  the 
other. 


PART OF SAMPLE OF COMPOSITION FROM SALT LAKE CITY
(GRADE 7B), VALUE 47.4

One sunny morning in May my five cousins who
were on their way to see the fair at Frisco stopped on
their way and came to see me. May father gave me
twenty dollars to entertain them. I was busy thinking
of the best way to do it. If finally decided to go to the
Bingham Copper Mines. This was satisfactory to all
and taking along a lunch we started off.

When we got there it was noon and everybody
was hungry so we opened up the lunch and ate until
there was not a crumb left. Next we hired a
guide to show us through the mines and what a sight
we seen. There were walls of dirt seemingly covered
with the yellow mettle. Our guid showed us where
the elevators were on which

PART OF SAMPLE OF COMPOSITION FROM BUTTE
(EIGHTH GRADE), VALUE 50


There  are  five  little  child- 
ren that  live  near  us  who 
are  very  poor.  They  sel- 
dom have  any  new  clothes 
and  less  often  any  toyes. 

On  Christmas  and  other 
days  when  we  children  have 
toys  these  children  may  be 
seen  looking  at  us  with 
longing  eyes,  and  Easter 
time  they  even  seem  en- 
vious. 


Well  I  would  first  buy 
each  child  a  pair  of  shoes 
about  three  and  one  half 
dollars.  Then  I  would  buy 
the  girls,  three  of  them, 
new  dresses.  The  boys  new 
suits.  Which  would  cost 
about  thirty  dollars.  Of 
course  the  girls  would  have 
to  have  hats.  I  would  get 
simple  ones  but  pretty. 
Then  the  boys. 

PART OF SAMPLE OF COMPOSITION FROM HACKENSACK, N. J.
(EIGHTH GRADE), VALUE 49.4


One  morning  I  got  up 
about  half-past  six  in  the 
morning,  this  was  in  the 
country  on  my  vacation 
and  I  helped  the  working- 
men  feed  the  chickens, 
give  them  water  and,  fill 
the  hoppers  with  charcoal, 
oyster  shells,  grit,  bran, 
meddlins,  alfalfa  and  some 
other  things  concerning 
chickens. 


Around  noon  times  the 
men  all  got  to-gether  and 
opened  a  small  pond,  and 
let  the  water  run  out  of 
it,  and  then  got  the  ground 
out  of  it  for  fertilizer,  and 
dug  it  out  also  to  make  it 
deeper  and  to  take  out 
all  the  bogs  and  things 
that  were  unnecessary  to 
be  there. 

VIII. Determination of Critical Paragraph in
Gray's Oral Reading Scale

One of the advantages of Gray's Scale is that its
use gives, in effect, a series of measurements of any
one individual from which it is possible to gain some
idea of the reliability of the results. For instance,
the record made by the eighth grade at Gary for
paragraph 1 was 204 words per minute; paragraph 2, 196;
paragraph 3, 191; paragraph 4, 196; paragraph 5, 186;
paragraph 6, 180; paragraph 7, 153; paragraph 8,
143; paragraph 9, 132; paragraph 10, 102; paragraph 11,
108; and paragraph 12, 78 (Table VIII, page 440). For
purposes of comparison the rate of reading (actually
recorded as the number of seconds required to read each
paragraph) has been converted into the number of words
read per minute. The average number of errors made in
each paragraph was also found. These data are shown
graphically in Figure 1, page 441. In paragraphs 1 to 4
the rate is practically constant and the number of errors
two or less. From paragraph 6 on, however, the number
of words read per minute falls off rapidly and the number
of errors per paragraph increases at a corresponding rate.
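
The conversion described above, from seconds per paragraph to words per minute, can be sketched as follows. This is only an illustration: the function names are hypothetical, the report does not give the word count of each paragraph, and the 51-word, 15-second timing is an invented example. The list of rates is copied from the figures quoted in the text.

```python
def words_per_minute(word_count: int, seconds: float) -> float:
    """Convert a timed reading of one paragraph into words per minute."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return word_count * 60.0 / seconds


# Eighth grade rates (words per minute) quoted in the text for
# paragraphs 1 to 12 of Gray's Scale.
GARY_EIGHTH_GRADE_WPM = [204, 196, 191, 196, 186, 180, 153, 143, 132, 102, 108, 78]


def change_from_previous(rates):
    """Return the change in rate from each paragraph to the next one."""
    return [later - earlier for earlier, later in zip(rates, rates[1:])]


if __name__ == "__main__":
    # Hypothetical timing (not from the report): 51 words read in 15 seconds.
    print(words_per_minute(51, 15))
    # Paragraph-to-paragraph changes in the reported rates.
    print(change_from_previous(GARY_EIGHTH_GRADE_WPM))
```

Listing the paragraph-to-paragraph changes makes the break in the series easy to see: the first drop of more than ten words per minute occurs between paragraphs 6 and 7.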

It is probable that the curve based upon the rates of
reading as given is unreliable because the units of rate
(number of words read per minute) are not equal. In
paragraph 1, eighty-five per cent. of the words are of
one syllable, while in paragraph 6 twenty-four per
cent. of the words have two or more syllables.
Unfortunately, very little is known about the time required
to read words composed of many syllables as compared
with the time required for shorter words. On the basis
of such data as are available, an estimate has been made
of the probable rate in terms of equal units (sound
divisions) and this estimate is represented in the figure by
the dotted line. A comparison of this curve and the
curve for errors will show that in Gary the eighth grade
children read at what is practically a uniform rate up
to paragraph 6, and the number of errors made in
paragraph 6, while more than in paragraph 1, has barely
exceeded the two errors per paragraph which are
permissible under Gray's Standard 4. From this point
on, however, the eighth grade children encounter
difficulties, for the rate of reading rapidly falls. The median
paragraph for the grade (see this report, Figure 43, page
267) is, therefore, probably two paragraphs beyond the
type of material the eighth grade Gary children are able
to read easily and without conscious effort. Paragraph
6 is apparently the critical paragraph and probably
represents the most difficult material the Gary children
should be judged able to read satisfactorily.
Comparisons with results from other cities, however, should be
made on the basis of results previously reported.
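
One way to make the idea of a critical paragraph concrete is sketched below. It is not the author's procedure, which also weighs the error curve and an estimated rate in equal units. It is a simplified rule, assumed here for illustration, that uses only the unadjusted rates quoted earlier: the critical paragraph is taken as the last paragraph, counting from the first, whose rate stays within a chosen tolerance of the average rate for paragraphs 1 to 4. The function name, the baseline of four paragraphs and the ten per cent tolerance are all assumptions.

```python
# Eighth grade rates (words per minute) quoted in the text for
# paragraphs 1 to 12 of Gray's Scale.
GARY_EIGHTH_GRADE_WPM = [204, 196, 191, 196, 186, 180, 153, 143, 132, 102, 108, 78]


def critical_paragraph(rates, baseline_count=4, tolerance=0.10):
    """Return the 1-based number of the last paragraph, counting from the
    first, whose rate is not more than `tolerance` below the baseline.

    The baseline is the mean rate of the first `baseline_count` paragraphs.
    Both parameters are illustrative assumptions, not values from the report.
    """
    if not 0 < baseline_count <= len(rates):
        raise ValueError("baseline_count must be between 1 and len(rates)")
    baseline = sum(rates[:baseline_count]) / baseline_count
    floor = baseline * (1.0 - tolerance)
    last_ok = 0
    for number, rate in enumerate(rates, start=1):
        if rate < floor:
            break  # the rate has fallen away from the uniform level
        last_ok = number
    return last_ok


if __name__ == "__main__":
    print(critical_paragraph(GARY_EIGHTH_GRADE_WPM))
```

With these assumed settings the rule picks paragraph 6, which agrees with the paragraph the text identifies. A different tolerance could move the result, so the figure should be read as a check on the argument and not as an independent finding.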


[Table VIII, page 440: eighth grade record, by paragraph, on Gray's Oral Reading Scale. The table is not legible in the source.]

Figure 1
Rate of Reading and Number of Errors

The diagram shows rate of oral reading and number of errors made
by eighth grade children when measured by the Gray Oral Reading Scale.

The  scale  along  the  base  of  the  figure  represents  the  paragraphs  in 
Gray's  Scale.  The  scale  along  the  vertical  axis  marked  "Rate"  repre- 
sents the  number  of  words  read  per  minute.  The  scale  along  the  vertical 
axis  marked  "Errors"  represents  the  number  of  errors  made  per  para- 
graph. The  curves  represent  eighth  grade  median  scores  for  52  chil- 
dren up  to  and  including  paragraph  7.  Only  50  children  are  represented 
in  the  scores  for  paragraph  8,  and  in  similar  fashion  the  number  of 
children  reading  each  paragraph  declines  to  27  for  paragraph  12.  This 
variation  in  the  number  of  children  reading  the  different  paragraphs 
accounts  for  some  of  the  irregularities  in  the  curves. 

The  heavy  solid  line  represents  the  curve  for  rate  of  reading.  The 
heavy  broken  line  represents  the  number  of  errors  made  per  paragraph. 

It will be noted that the rate of reading decreases so slowly from
paragraphs 1 to 6 that the rate is practically constant, but paragraphs 7, 8,
and 9 and the succeeding paragraphs are read much more slowly. As the
rate decreases, the number of errors increases.

The curve for rate of reading is based upon the number of words
irrespective of the length. Each paragraph of the scale, however,
consists of words of a larger number of syllables than the preceding
paragraph. It is probable that, could the rate of oral reading be measured in
terms of a single unit, the curve would approximate the dotted line.


From the curves the inference is drawn that paragraph 6 is a critical
paragraph for the Gary children. Up to and including paragraph 6,
the children read at a rate which is uniform, and
the number of errors per paragraph is approximately two. From
this point on, however, the change in rate and in the number of errors is
marked.


IX. The Method of Scoring Reading and
Reproduction Tests

There  are  no  well  established  methods  of  scoring  the 
attempts  of  children  to  reproduce  a  story  which  they 
have  read  silently.  Accordingly,  the  method  followed  in 
the  Gary  survey  will  be  explained  in  detail. 

Part  of  one  of  the  stories  used  for  the  Reproduction 
Test  is  shown  in  Figure  2,  page  446.  This  was  analyzed 
by  giving  each  main  thought  a  key  letter.  Each  element 
in  the  idea  was  called  a  point  and  given  a  number.  The 
general  scheme  is  also  illustrated  in  Figure  2. 

No  two  persons  would  analyze  the  material  in  the  same 
way,  and  it  is  even  difficult  for  one  person  to  be  consistent 
throughout.  However,  the  advantage  of  this  form  of 
analysis  is  that  it  enables  the  examiner  to  trace  the  action 
of  memory  in  the  process  of  reproduction.  The  analysis 
of  each  paper  shows  plainly  which  elements  were  recalled 
and  in  what  order.  The  reader  should  note  that,  as 
analyzed,  the  paragraph  yields  8  ideas  and  43  points. 
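
The scoring key can be pictured as a small table of ideas, each with a key letter and a count of points. The sketch below is only an illustration of that structure: the class and function names are hypothetical, and the point counts are read from the key in Figure 2, with the count for idea F taken as 5 to match the five numbered points listed under it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Idea:
    """One main thought of the paragraph, with its key letter and point count."""
    letter: str
    text: str
    points: int


# Key for the sample eighth grade paragraph, as read from Figure 2.
# The count for F is an assumption (5, matching its five numbered points).
KEY = [
    Idea("A", "Fred came down late to breakfast one morning", 6),
    Idea("B", "So late that all the other members of the family were "
              "through and had gone about their respective duties", 7),
    Idea("C", "But though he had slept late, Fred was still sleepy", 4),
    Idea("D", "For he had staid up until twelve o'clock the night before", 4),
    Idea("E", "Whereas he was usually in bed by nine", 5),
    Idea("F", "To tell the truth, he was also rather cross", 5),
    Idea("G", "As cross as most boys are apt to be when they are sleepy", 5),
    Idea("H", "And as he took his seat, he said; Pshaw there's nothing "
              "on the breakfast table", 7),
]


def totals(key):
    """Return (number of ideas, total number of points) for a key."""
    return len(key), sum(idea.points for idea in key)


if __name__ == "__main__":
    print(totals(KEY))
```

Summing the point counts in this way reproduces the totals stated in the text, 8 ideas and 43 points.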

The papers written by the children were scored by
means of this key and marked for quantity and quality
of reproduction. The number of units (main ideas) reproduced
was taken as the quantity score. As the test was used
primarily for the purpose of determining the rate of
reproduction, the children were stopped at the end of exactly
three minutes. This fact was taken into consideration
in determining the accuracy scores. The quality, or
accuracy, of reproduction was taken as the ratio between
the points exactly reproduced and the points in the
original from the beginning up to the most advanced point
found in the reproduction.
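
The accuracy score defined above is a simple ratio, and a short sketch may make the arithmetic easier to follow. The function name is hypothetical. The three pairs of numbers are taken from the analyses printed with Figures 3, 5 and 6 of this appendix: points exactly reproduced, and points in the original up to the most advanced point reached.

```python
def accuracy_score(points_exact: int, points_in_span: int) -> float:
    """Accuracy of reproduction, in per cent.

    points_exact:   points reproduced exactly as in the original
    points_in_span: points in the original, from the beginning up to the
                    most advanced point found in the reproduction
    """
    if points_in_span <= 0:
        raise ValueError("points_in_span must be positive")
    if not 0 <= points_exact <= points_in_span:
        raise ValueError("points_exact must lie between 0 and points_in_span")
    return 100.0 * points_exact / points_in_span


# (points exactly reproduced, points in the portion of the original covered)
EXAMPLES = {
    "poorest paper (Figure 3)": (8, 116),
    "median paper (Figure 5)": (26, 79),
    "pupil of little ability (Figure 6)": (4, 26),
}

if __name__ == "__main__":
    for label, (exact, span) in EXAMPLES.items():
        print(f"{label}: {accuracy_score(exact, span):.1f} per cent")
```

Rounded to whole numbers, these ratios give 7, 33 and 15 per cent, the accuracy scores reported for the three papers. The quantity score needs no computation, being simply the number of main ideas reproduced.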

For instance, in the poorest eighth grade paper found
(Fig. 3, pages 447-8) in three minutes a boy wrote but 16
words, and reproduced but 3 of the 22 main ideas in the
story, down to the word "pepper." Therefore, his
quantity score is 3. He reproduced but 8 points out of 116
points in the original and one additional point was
transformed ("was late" for "came down late"). His quality
score, or accuracy, is 8/116, or seven per cent.

Of the two best eighth grade papers one is a paper
which reproduces a large number of main ideas, the other
a paper which reproduces most exactly the material
read (Figure 4, pages 448-9). The first child wrote 83
words in three minutes, the second, 62. In the 83 words
were 13 ideas, yet the accuracy of reproduction, as
measured by the points exactly reproduced, was but 35 per
cent. The second paper, however, had an accuracy of 70
per cent. In this case the paper written by the child is
very nearly an exact reproduction of the original story.

A  third  type  (Figure  5,  pages  449-50)  is  the  paper  which 
represents  most  nearly  the  median  scores  for  the  eighth 
grade  (eight  ideas  reproduced  with  an  accuracy  of  33  per 
cent.). 

One other illustration will be discussed (Figure 6, pages
450-51), a sample which makes it very evident that the
quality of the reproduction is determined by something
more than mere comprehension of the meaning of the
passage read. In this case the boy reproduced 3 out of
the five units correctly, and out of the 15 points which
occur in the reproduction test but 4 are given exactly
as they appear in the original. Two of the points are
additions and the rest are transposed or transformed in
some way. His scores are 3 units and 15 per cent.
accuracy. The reproduction is much below the average
of the class, but the cause is not failure to read or failure
to understand. His rate of reading is 136 words per
minute¹ (median for his grade, 204), his average score
in the Kansas Reading Test (average of scores in Tests
I, II, and III) is 15.3 points attempted (corresponding
grade score, 23.9), but his accuracy of reading in the
Kansas Test is 91 per cent. as compared with 83 per
cent. for his grade. In other words, he is a slow but
careful reader who comprehends what he reads better
than the average. His score in composition was 39
Hillegas (grade 45.8), and he has low scores in spelling and in
the Trabue Tests.

In  view  of  all  these  facts,  therefore,  it  would  seem  safer 
to  infer  limited  capacity  and  special  difficulty  in  all 
language  work  as  an  explanation  rather  than  lack  of 
comprehension  in  reading. 

While conclusions may not be based on a single case,
even a single exception may show the existence of a
problem in need of careful study. Any one scoring a
set of papers from a reproduction test will find plenty
of evidence that ability in composition enters so largely
into reproduction that until a suitable investigation has
been made it is unwise, to say the least, to judge of
comprehension in reading by means of reproduction.
Accordingly, while the general records for Gary given
above would appear to represent a poor quality of work
in the reproduction test, the author infers that the cause
of the low scores is to be sought more in the difficulties
of the children in reading, spelling, and the mechanics of
English composition than in mere failure to comprehend
what is read.

¹ Test II, 136; Test III, 123; Test IV, 134.

Figure 2
Analysis of Sample Eighth Grade Paragraph of the Reading
and Reproduction Test

ORIGINAL  PARAGRAPH 

Fred  came  down  late  to  breakfast  one  morning,  so 
late  that  all  the  other  members  of  the  family  were 
through,  and  had  gone  about  their  respective  duties. 
But  though  he  had  slept  late,  Fred  was  still  sleepy; 
for  he  had  staid  up  until  twelve  o'clock  the  night  before, 
whereas  he  was  usually  in  bed  by  nine.  To  tell  the  truth, 
he  was  also  rather  cross, — as  most  boys  are  apt  to  be 
when  they  are  sleepy, — and  as  he  took  his  seat,  he  said; 
"Pshaw"  there's  nothing  on  the  breakfast  table." 

KEY    MAIN SUBDIVISIONS (IDEAS)                                    POINTS

1 A    Fred came down late to breakfast one morning                    6
2 B    So late that all the other members of the family
       were through and had gone about their respective
       duties.                                                         7
3 C    But though he had slept late, Fred was still sleepy             4
4 D    For he had staid up until twelve o'clock the night before       4
5 E    Whereas he was usually in bed by nine                           5
6 F    To tell the truth, he was also rather cross                     5
7 G    As cross as most boys are apt to be when they are sleepy.       5
8 H    And as he took his seat, he said; "Pshaw there's
       nothing on the breakfast table."                                7

       Total                                                          43

In the upper half of the figure the paragraph is given as it appeared in
the Test. In the lower half of the figure it is given as it appeared in the
key used in scoring. Each division of the paragraph considered to
represent a distinct major thought or idea is given a key letter, and each
element in the idea is called a point and given a number. The
important connectives and unusual modifiers are considered as separate
elements. The words underlined are considered single points. As a
whole, the paragraph yields 8 ideas and 43 points.

Figure  3 
The  Poorest  Eighth  Grade  Reproduction 

REPRODUCTION 

Fred  was  late  for  breakfast  one  morning 
there  was  nothing  on  the  table  except  some  pepper 

ANALYSIS

Total ideas in portion of original passage reproduced: 22. Total
points: 116. Ideas reproduced: 3. Points exactly reproduced: 8.
Quantity score: 3. Accuracy score: 7 per cent.

Record: [not legible in source]

The line is drawn around A to show that the idea, while reproduced,
has been transformed or altered in the reproduction. For meaning of
the points represented by A1, A2, etc., see key, Figure 2.

Figure  4 
The  Best  Two  Eighth  Grade  Reproductions 

FIRST  PAPER 
REPRODUCTION 

Fred  got  up  very  late  one  morning  and  was  very  cross 
and  still  sleepy  he  was  up  until  twelve  o'clock  the  night 
before  when  he  generally  goes  to  sleep  at  nine.  He  went 
to  the  table  and  all  the  rest  of  the  family  was  through 
and  he  called  kate  but  Kate  did  not  hear  as  she  was 
feeding  the  poultry  in  the  back  yard. 

He  then  said  lazily  nothing  on  the  table  and  he  put 
his head down and pretty soon he began to

Record: [not legible in source]

The first paper represents the largest amount (number of main
ideas) reproduced, and the second paper represents the greatest
accuracy of reproduction.

ANALYSIS 

Total  ideas  in  portion  of  original  passage  reproduced — 16.  Total 
points — 79.  Ideas  reproduced — 13.  Points  exactly  reproduced — 28. 
Quantity  score — 13.    Accuracy  score — 35. 

SECOND  PAPER 
REPRODUCTION 

Fred  came  down  to  the  breakfast  table  late,  All 
the  other  members  of  the  family  had  finished  and  were 
doing  their  respective  duties.  Although  Fred  had  slept 
long  he  was  still  sleepy,  for  he  had  gone  to  bed  at  twelve 
o'clock  the  night  before,  whereas  he  was  usually  in  bed 
at  nine.    To  tell  the  truth  he  was  also  cross,  as  boys 

Record: [not legible in source]

ANALYSIS

Total ideas in portion of original passage reproduced: 7. Total
points: 36. Ideas reproduced: 7. Points exactly reproduced: 25.
Quantity score: 7. Accuracy score: 70.

Figure  5 
A  Median  Eighth  Grade  Reproduction 

REPRODUCTION 

Fred  came  downstairs  one  day  for  breakfast  but  he 
was  late  for  he  had  just  gone  to  bed  at  twelve  the  other 
night  and  could  not  get  up  in  time  and  when  he  came  to 
the  table  he  was  lazy  and  said  "Theres  nothing  on  the 
table"  Then  he  sat  down  and  began  to  call  out  "Kate" 
"Kate"  but  Kate  never  came  because  she  was  feeding 
the  chickens  in  the  yard.  Then  all  at  once  he  began  to 
sneeze. 

Record [scoring notation illegible in source]

The line is drawn around A to show that the idea, while reproduced,
has been transformed or altered in the reproduction. For meaning of
points represented by A1, A2, etc., see key, Figure 2, page 446. The symbol
X represents additions or ideas not found in the test.

ANALYSIS 

Total  ideas  in  portion  of  original  passage  reproduced — 16.  Total 
points — 79.  Ideas  reproduced — 8.  Points  exactly  reproduced — 26. 
Quantity  score — 8.    Accuracy  score — 33. 

Figure  6 
Reproduction  by  Pupil  of  Little  Ability 

REPRODUCTION 

Fred  was  a  boy  how  who  had  came  down  from  his 
bed  room  lat  for  dinner  and  was  lat  because  he  had  been 
out  untill  twelve  that  night  when  his  usal  bedtime  was. 
Record [scoring notation illegible in source]


The line is drawn around A to show that the idea, while reproduced,
has been transformed or altered in the reproduction. For meaning of
points represented by A1, A2, etc., see key, Figure 2, page 446. The
symbol X represents additions or ideas not found in the test.


Total  ideas  in  portion  of  original  passage  reproduced — 5.  Total 
points — 26.  Ideas  reproduced — 3.  Points  exactly  reproduced — 4. 
Quantity  score — 3.    Accuracy  score — 15  per  cent. 


X.    Variability  and  Its  Significance 

To  provide  an  objective  illustration  of  variation  in 
individual  scores,  the  performances  of  the  members 
of  a  certain  eighth  grade  class  in  a  very  simple  test 
will be discussed. A test in copying figures1 was given
five times. It involves a minimum of thought elements
and measures simply the writing reaction time.
The  score  is  the  number  of  figures  written  per  minute 
without  regard  to  quality.  The  starting  and  stopping 
signals  were  given  by  means  of  an  automatic  timing 
device.  The  tests  were  considered  by  the  children  as  a 
game,  and  under  practice  their  scores  rose  rapidly. 

The  median  score  of  the  group  of  22  eighth  grade 
children  was  100  figures  on  the  initial  test  (Tuesday). 
On  the  second  testing  day  (Thursday),  when  three 
trials  were  given  in  succession,  the  first  trial  resulted 
in a score of 103.5, a gain of 3.5 figures; the second
trial in 111 figures, a further gain of 7.5 figures; and the
third trial in 115 figures, a further gain of 4 figures.
When the test was given for the fifth time (on the following
Tuesday) there was no further improvement, the
score being 115 figures (Table IX, page 454).

The  scores  of  individual  children,  however,  vary  in 

1Taken from Courtis Standard Research Tests, Series A, Test 5.

452 
quite  a  different  fashion.  Two  children  improved  twenty 
figures  between  the  first  and  second  trials.  The  score 
of  another  child  declined  ten  figures.  A  third  child 
made  his  highest  score  on  the  third  trial,  while  still 
another  made  precisely  the  same  score  in  three  trials, 
a  slightly  higher  score  in  the  fourth  trial  and  a  lower 
score  in  the  fifth  trial. 

The  variations  in  the  performances  of  individual  mem- 
bers of  a  class  may  be  accounted  for  in  different  ways. 
When  taking  the  first  test  some  of  them  proceeded  cau- 
tiously and  suspiciously.  They  eyed  closely  the  stranger 
who  gave  the  test  and  devoted  a  good  deal  of  their  atten- 
tion to  the  timing  device.  They  had  various  thoughts 
and  emotional  reactions  to  which  their  conscious  atten- 
tion was  mainly  given.  But  when  they  had  taken  one 
test  and  knew  what  to  expect  they  began  to  put  forth 
effort.  They  tried  to  excel  their  own  scores  and  those 
of  their  neighbors.  The  competitive  spirit  developed 
and  their  scores  rose. 

For  others,  taking  the  test  was  itself  a  species  of  train- 
ing. Their  abilities  developed  because  their  initial 
abilities  were,  for  the  most  part,  far  below  their  capaci- 
ties. Some  of  the  children  became  more  skillful  in 
turning  over  their  papers  quickly  and  getting  to  work, 
others  learned  to  hold  their  attention  rigidly  to  the  task, 
others  made  their  figures  less  carefully,  and  others  in 
still  different  ways  succeeded  in  writing  a  larger  number 
of  figures  with  each  trial. 

On  the  other  hand,  factors  of  quite  a  different  character 
produced  variations  in  the  opposite  direction.  A  boy 
trying  too  eagerly  used  force  enough  to  break  his  pencil 
point  and  lost  time  finding  another  pencil.  A  girl 
stopped  in  the  middle  of  a  test  to  pin  up  a  curl  that 
bothered  her  by  swinging  before  her  eyes.  A  boy  sneezed 
and  took  out  his  handkerchief.  After  one  or  two  trials 
a  girl  who  was  slow  became  discouraged  and  thereafter 
wrote  listlessly  without  real  effort. 

The  performances  of  the  children  in  the  class  cover  a 
zone  about  fifty  five  figures  wide.  More  than  half  the 
children  have  a  difference  between  the  highest  and  lowest 
score  made  of  twenty  figures  or  more,  while  the  corre- 
sponding difference  from  the  class  as  a  whole  is  but  fifteen 
figures.  The  pupil  who  had  the  most  constant  results 
(No.  16)  has  a  variation  between  highest  and  lowest 
scores  of  ten  figures.  It  should  be  evident,  therefore, 
that  ability  to  copy  figures  cannot  be  determined  for 
the  individual  by  a  single  test;  that  the  performances 
of  both  the  individual  and  of  the  class  change  from  test 
to  test  under  the  play  of  many  forces;  that  the  ability 
of  the  individual  can  be  inferred  with  certainty  only 
from  a  whole  series  of  tests  which  make  evident  the 
conditions  under  which  variations  take  place.  For  the 
class,  however,  the  single  test  is  much  more  significant, 
for  the  group  score  and  the  group  variations  do  not 
change  greatly  in  subsequent  tests.  The  coefficients  of 
correspondence1  in  the  five  trials  of  the  copying  of 
figures  test,  based  upon  the  individual  scores  of  the 

1For an explanation of this measure of relationship see p. 475.
42 eighth grade children present in all tests, bring
out the effect of these chance variations in destroying
the apparent correlations (Table X). The scores in
Test I, for instance, bear about the same relation to the
scores in Test IV that the results from any two educational
tests would bear (50 per cent.). On the other
hand, there is almost perfect correspondence (98 per
cent.) between the scores of the children as determined
by  the  median  of  five  trials,  and  the  scores  on  the  second 
trial.  After  the  second  trial  the  factors  of  turning 
papers,  making  poorer  figures,  etc.,  begin  to  distort  the 

TABLE X
Comparative Results of Different Trials in Copying Figures1

TEST I        MEDIAN    MEDIAN DEVIATION    TOTAL RANGE
Trial I       100       10                  46-121
  "   II      110       10                  60-135
  "   III     115       10                  62-145
  "   IV      117       8                   66-145
  "   V       117.5     7.5                 60-140
W. S.2        114       11                  60-135

Percentages of Total Cases Which Do Not Vary in Relative
Position More Than One Unit of Variability

TRIALS    I     II    III   IV    V     W. S.
I         -     76    74    50    57    69
II        76    -     90    81    74    98
III       74    90    -     83    71    93
IV        50    81    83    -     83    83
V         57    74    71    83    -     91
W. S.2    69    98    93    83    91    -

1Based on the scores of 42 eighth grade pupils.

2W. S. means weighted score, the score taken as most representative of each individual's
ability.
scores  so  that  they  no  longer  represent  the  thing  the 
tests  are  designed  to  measure. 

The  scores  of  this  group  of  children  illustrate  another 
important  and  exceedingly  interesting  point.  The 
standard  deviation  (based  on  the  class  median)  for  the 
scores  representing  the  median  performances  of  each 
individual (11.8) is smaller than that for any single trial
of the test, except the third, which has the same value.
In  other  words,  the  effect  of  the  variation  in  individual 
scores  is  to  increase  the  apparent  range  of  variation 
within  the  class.  On  the  basis  of  the  results  of  a  single 
test,  individuals  within  classes  appear  to  differ  more  in 
ability  than  they  really  do. 

For  instance,  a  group  of  22  fifth  grade  children  was 
selected,  each  member  of  which  had  the  same  median 
score  for  the  five  trials  (95).  That  is,  they  represented  a 
group  of  children  of  equal  ability  in  copying  figures  as 
far  as  that  ability  can  be  determined  by  five  trials  of  the 
test  (Table  XI,  page  458,  Figure  7,  page  460).  The 
real variability of the group is zero, yet two individuals
(Nos. 1 and 2) on the first trial had very low
scores which totally misrepresent their true abilities;
two other individuals (Nos. 15 and 19) had very high
scores in the fifth trial, which similarly misrepresent
their true abilities.1 Similar extreme scores

1These last two are from the same class and it might be thought that
in this class an error in timing had been made. However, an inspection
of the class medians and class distributions shows that they
are  precisely  similar  to  those  of  other  classes  of  the  same  grade,  and 
that  these  two  individual  scores  are  simply  erratic  variations. 
Figure 7
Variation in Performances of Children of Equal Ability

Range of Variation in Performance: Copying Figures

[Chart not reproduced.]


The scale along the left hand vertical axis represents the number of
figures copied per minute. The squares represent individual children.
The arrows show the range of variation from the lowest to the highest
score; thus, the lowest score made by individual 1 was [illegible] figures per
minute on the first trial, the highest score was 110 figures per minute on
the fourth trial; the point of the arrow in each case showing the lowest or
highest score, and the figure just beyond the arrow showing the number
of the trial in which the score was made.

Individuals Nos. 2, 3, 15, and 19 show great variation. Individuals
Nos. 16, 17, 18, 20, 21, and 22 show very small variation. The 22 individuals
had 110 scores in the five trials; 35 of these scores were 95 figures
per minute, 51 other scores were within 10 points above or below 95.
That is, 78 per cent. of the scores fell within 10 points of 95. Twenty
four scores, or 21 per cent. of the total, show a variation exceeding 10
points from 95.
occur  in  Test  I,  in  Test  II,  and  in  each  of  the  other  tests. 
Only  seven  children  (Nos.  14,  16,  17,  18,  20,  21  and  22) 
show  a  total  range  of  less  than  20  figures,  yet  88  per  cent, 
of  all  the  scores  fall  within  ten  figures  of  95.  In  general, 
only  very  seldom  would  a  test  be  given  to  a  group  of  20 
or  more  children  without  securing  scores  which  would 
misrepresent  grossly  the  abilities  of  at  least  2  children. 

One  phase  of  the  question  still  remains  to  be  discussed 
— the  relation  of  variation  in  performance  to  the  measure- 
ment of  individual  ability.  All  that  has  been  said  so 
far  has  tended  to  emphasize  the  unreliability  of  individual 
performance  in  a  single  test,  and  the  reader  might  easily 
get  the  idea  that  tests  were  utterly  worthless.  The  truth 
of  the  matter  is  that  for  eighty  to  ninety  per  cent,  of  the 
children  a  single  test  yields  a  fairly  reliable  measure. 
For  half  the  children  the  results  are  perfectly  satisfactory. 
For  any  group  taken  as  a  whole  the  chance  variations 
offset  each  other,  so  that  class  scores  may  be  depended 
upon.  Generalized  scores  based  upon  city  wide  dis- 
tributions of  a  large  number  of  cases  reveal  with  absolute 
certainty  the  general  level  of  achievement  under  the 
given  conditions. 

Many  persons  seem  disinclined  to  accept  general 
results  as  valid  when  they  know  that  some  of  the  individ- 
ual results  are  unreliable.  It  is  easy  to  show,  however, 
that  in  general  the  individual  results  are  reliable  and 
that  such  variations  as  occur  fall  within  certain  general 
limits.  That  is,  an  individual  tends  to  maintain  his 
general  level  in  the  group,  so  that  when  the  results  of  a 
number of different tests are available his general ability
[is apparent]. Of course allowance must be made for the
[differences] in capacity which make for personality. One
child excels in mathematics and is poor in English, one
may even be strong in addition and weak in subtraction.
But in general, the consistency of an individual's scores
is made evident by any comparison of the results of an
entire series of tests.

When the variability ratios1 of any individual are
available in many different tests the average ratio may
be taken as a measure of the individual's general ability.
On the assumption (probably unwarranted) that all the
individuals compared have had equal opportunities for
training, this average ratio would represent a measure
of relative capacity. Such average ratios were obtained
for the 42 eighth grade children present in all tests.

The ratio of the individual having the highest average
ratio is consistently from 0 to 2 units above the median
(Table XII, page 462, Figure 8, page 467). In but fifteen
per cent. of the cases does the ratio fall below the
median and these are offset by an equal number of exceptionally
high scores. In most of the tests, the child's
scores place him in the group in his correct position as
exactly as "general ability" has a constant value.

Similarly, the results for other children show equal
consistency. The performances of the less able individuals
tend to be more variable and, therefore, less reliable,
but in from sixty to eighty per cent. of the cases a single

1For an explanation of this measure of relative position, see p. 475.
Figure  8 — Continued 

The  scales  along  the  top  of  the  figure  show  relative  position  within  the 
group.  M — median.  The  unit  is  the  median  deviation  from  the 
median.  The  figures  along  the  scale  at  the  left  of  the  different  figures 
represent  the  various  tests  in  Table  XII,  page  462.  The  dotted 
lines  represent  the  actual  scores  of  3  individuals  in  the  group  of  42  eighth 
grade  children  present  for  all  tests.  A  represents  the  individual  having 
the  highest  average  score,  B  the  most  consistently  median  individual, 
and  C  the  individual  having  the  lowest  average  score.  The  solid  lines 
are  based  upon  the  average  values  for  the  different  subjects. 

The curves make evident the extreme range of variation in performance
in any one test, and the reliability of scores based upon many
tests. For instance, individual A ranges from nearly the lowest position
in the class to the highest in single tests, yet in each subject his average
position is closely the same as for all. The same is true of B, and to a
lesser extent, of C. The curves show the individual differences of
children.  Thus  relatively,  B  is  weak  in  writing,  strong  in  spelling,  weak 
in  arithmetic  and  composition,  and  strong  in  reading.  The  graph  also 
tends  to  show  that  the  less  able  individuals  exhibit  greater  fluctuations  in 
performance  than  the  more  able. 

test  affords  a  fairly  accurate  indication  of  the  ability  of 
the  individual  (within  one  unit  of  variability).  The  re- 
liability of  group  results  is  due  both  to  the  fact  that  the 
individual  results  are  in  general  reliable,  and  to  the 
additional  fact  that  such  variations  as  occur  are  as 
likely  to  take  place  in  the  direction  of  higher  scores  as  of 
lower. 

The  variability  of  a  group  of  children  is  greater  in  the 
more  complex  tests  than  in  the  simple  ones.  Children 
are  more  alike,  for  instance,  in  the  rate  at  which  they 
copy  figures,  which  involves  merely  control  of  muscular 
action  (coefficient  of  variability  .10)  than  they  are  in  a 
test for rate of reading, which involves mental elements
(coefficient .20), or in a test of adding or spelling, which
measures the results of a long course of training (addition
.25) (Table XIII, page 469). The more complex the
mental  activity  involved,  that  is,  the  more  the  activity  is 
a  direct  product  of  a  long  and  complex  series  of  trainings, 
the  greater  should  be  the  variability,  because  those  of 
greater  capacity  respond  to  training  more  readily.  If  the 
training  were  continued  long  enough,  the  more  able  might 
tend  to  reach  a  maximum  and  thereafter  further  training 
might  operate  to  reduce  variability.  But  at  all  times, 
other  conditions  being  equal,  the  greater  the  complexity 
of  the  ability  tested,  the  greater  the  variability  is  likely 
to  be. 

The  fact  gives  added  significance  to  the  tests  of  simple 
mechanical  skills.  For  Gary,  and  for  other  similar  ex- 
periments, the  claim  is  sometimes  made  that  they  are 
"attempting  to  meet  new  demands  for  a  more  practical 
education  by  the  selection  and  organization  of  a  cur- 
riculum strictly  in  terms  of  the  activities  and  environ- 
ments of  both  children  and  adults."1  In  such  experi- 
ments, however,  the  aims  which  are  set  up  as  desirable 
are  always  less  clearly  defined  and  very  much  more  com- 
plex than  the  simple  work  in  conventional  schools.  It 
is  extremely  probable,  therefore,  that  unless  measure- 
ment proves  that  the  new  curriculum  is  so  well  organized 
and  administered  that  it  controls  more  effectively  the 

1Control  of  Educational  Progress  through  Educational  Experimen- 
tation.   School  and  Society,  Vol.  1,  No.  126,  Meriam. 
forces acting in the relatively simple problems of equipping
children with the essential tools by which all mental
work is done, the efficiency of the work directed
toward  the  attainment  of  the  higher  ends  will  be  corre- 
spondingly lower.  So  far,  no  schools  seem  to  have  met 
successfully  this  test  of  controlling  the  simpler  and  more 
mechanical  phases  of  education.  Yet  educational  prob- 
lems are  not  to  be  considered  solved  because  one's  aim 
is  worthy  or  one's  theory  plausible.  In  the  opinion  of 
the  writer,  to  be  considered  wholly  successful,  classroom 
teaching  must  result  in  a  reduction  of  the  present  vari- 
ability in  the  final  product. 


XI.    Statistical  Terms  and  Methods 

An  important  detail  of  survey  work  is  the  choice 
of  methods  by  which  the  results  of  educational  tests  are 
prepared  for  publication.  Readers  unfamiliar  with  sta- 
tistical procedure  will  find  in  appropriate  textbooks  full 
explanations  of  the  general  terms  and  methods  used. 
Each  investigator,  however,  is  likely  to  modify  general 
methods  to  suit  his  own  needs.  In  this  section  are  given 
explanations  of  these  methods  which  are  peculiar  to  this 
report. 

MEDIAN  DEVIATION 

The  measure  of  variability  most  frequently  used  in 
this  report  is  median  deviation.  This  measure  bears  the 
same  relation  to  the  deviations  from  the  median  that  the 
median  bears  to  the  scores. 

It  is  found  by  an  approximate  method.  Any  distribu- 
tion of  scores  is  converted  into  a  distribution  of  devia- 
tions as  follows: 

Scores        3    4    5    6    7    8    9    Scores      Median Score 5.8*
Frequency     7    2    5    4    3    3    1    Children

Frequency     5    6    10   3    1    Children
Deviations    0    1    2    3    4    Steps       Median Deviation 1.7


*Or 5.7, if the mid-point of the median score is taken.

472 


APPENDIX  A  473 

The  five  children  whose  scores  are  5  each  are  considered 
to  have  no  deviation  from  the  median,  the  two  children 
whose  scores  are  4  and  the  four  children  whose  scores 
are  6  are  considered  to  form  a  single  group  of  6  whose 
scores  are  one  step  away  from  the  median.  Similarly, 
there  are  ten  children  (7+3)  whose  scores  are  two  steps 
away,  three  children  three  steps  away,  and  one  whose 
score  is  four  steps  away  from  the  median.  The  median 
deviation  is  the  thirteenth  in  order  of  size  and  this  is 
found  in  precisely  the  same  fashion  as  the  median, 
with  one  exception.  The  median  is  considered  as  falling 
at  the  midpoint  of  its  step,  consequently,  the  deviations 
in  the  table  above  represent  the  mid-points  of  the 
steps  and  not  their  beginning.  To  be  consistent  with 
the  practice  followed  in  other  tables  of  the  report,  the 
deviations should be labeled 0, 0.5, 1.5, 2.5 and 3.5 respectively,
but the method is much easier to explain in the
form given. The corrected median deviation is found as
follows: 5 + 6 = 11, 13 - 11 = 2, 2 ÷ 10 = .2, 1.5 + .2 = 1.7.

The  semi-interquartile  range  is  1.68. 

The  mid-point  of  the  median  deviation  is  1.65. 
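
The arithmetic of this example can be checked with a short sketch. The Python below is an illustration only and is not part of the original report; the helper name `locate` and the layout of the step tuples are assumptions made for the example. Under those assumptions it arrives at the median score of 5.8, the corrected median deviation of 1.7, and the mid-point value of 1.65 given above.

```python
# Illustrative sketch (not from the report): the approximate median-deviation
# method applied to the example distribution. `locate` is a hypothetical helper.

scores = {3: 7, 4: 2, 5: 5, 6: 4, 7: 3, 8: 3, 9: 1}  # score -> number of children


def locate(steps, position):
    """Return the value at which the `position`-th case falls.

    `steps` is a list of (lower_edge, width, count) tuples in ascending
    order; cases are assumed to be spread evenly across each step.
    """
    below = 0
    for lower_edge, width, count in steps:
        if below + count >= position:
            return lower_edge + width * (position - below) / count
        below += count
    raise ValueError("position lies beyond the last case")


n = sum(scores.values())   # 25 children
middle = (n + 1) / 2       # the thirteenth case

# Median score: each score is treated as a step one unit wide.
score_steps = [(s, 1, c) for s, c in sorted(scores.items())]
median_score = locate(score_steps, middle)

# Deviations are counted in whole steps from the step containing the median.
median_step = int(median_score)
deviations = {}
for s, c in scores.items():
    d = abs(s - median_step)
    deviations[d] = deviations.get(d, 0) + c

# Corrected step edges: step 0 runs from 0 to 0.5, step k from k - 0.5 to k + 0.5.
deviation_steps = [
    (0, 0.5, c) if d == 0 else (d - 0.5, 1, c)
    for d, c in sorted(deviations.items())
]
median_deviation = locate(deviation_steps, middle)       # thirteenth case
midpoint_deviation = locate(deviation_steps, n / 2)      # mid-point (12.5)

print(round(median_score, 2), round(median_deviation, 2), round(midpoint_deviation, 2))
```

Counting to the thirteenth case gives the values printed in the table; counting to the mid-point (12.5) gives the alternative values mentioned in the footnote and in the last line above.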

CITY   WIDE  SCORES 

City  wide  median  scores  were  derived  from  distribu- 
tions of  individual  scores  formed  by  consolidating  the 
distributions  for  the  different  classes  of  a  given  grade, 
not  by  finding  the  median  of  the  class  medians.  The  A, 
B,  and  C  sections  of  the  classes  in  the  different  schools 
were  first  combined  separately  then  merged  into  the 
single distribution for the grade. The median score
based on this distribution is treated as the score of the
city for the given grade.1
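
The difference between the two procedures can be seen in a small sketch. The class lists below are invented for illustration and are not Gary data, and the ordinary median from Python's `statistics` module is assumed in place of the interpolated median used elsewhere in this report.

```python
from statistics import median

# Hypothetical scores for three classes of one grade (invented for illustration).
class_scores = {
    "A": [40, 50, 60],
    "B": [70, 80, 90, 95, 100],
    "C": [55, 65],
}

# Procedure NOT used in the report: the median of the class medians.
class_medians = {name: median(values) for name, values in class_scores.items()}
median_of_medians = median(class_medians.values())

# Procedure described above: consolidate all individual scores, then take the median.
pooled = [score for values in class_scores.values() for score in values]
city_wide_median = median(pooled)

print(median_of_medians, city_wide_median)
```

With these invented figures the median of the class medians is 60, while the consolidated distribution gives 67.5, because the larger class contributes more individual scores to the pooled distribution.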

GENERALIZED  SCORES 

The  function  of  the  measurements  made  in  the  survey 
was  conceived  to  be  the  determination  of  the  general 
level  of  abilities  of  the  different  grades  at  Gary.  Conse- 
quently, although  the  median  score  of  every  class  in 
every  school  was  found  and  graphed,  in  the  general 
tables  only  the  "generalized  city  wide  scores"  are  given. 
That  is,  small  irregularities  in  the  various  development 
curves  derived  from  city  wide  scores  were  smoothed  out, 
sometimes  by  averaging  the  scores  of  a  given  grade  with 
those  of  the  grade  above  and  below,  sometimes  graph- 
ically, and  sometimes  by  arbitrary  adjustments.  The 
amount  of  such  adjustment  is  everywhere  small  and 
some  measure  of  its  amount  is  always  reported.  The  re- 
sulting generalized  curves,  therefore,  represent  the  gen- 
eral trend  of  the  development  of  ability  at  Gary  minus 
the  small  irregularities  that  normally  occur  in  every 
school  system  from  year  to  year. 
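
One of the smoothing procedures named above, averaging the score of a grade with those of the grades above and below, may be sketched as follows. The grade scores are hypothetical, the function name `generalize` is invented for the example, and leaving the lowest and highest grades unchanged is an assumption, since the report does not say how the end grades were treated.

```python
# Hypothetical city wide median scores by grade (invented for illustration).
city_wide = {4: 50.0, 5: 62.0, 6: 58.0, 7: 70.0, 8: 75.0}


def generalize(by_grade):
    """Average each grade's score with the grades immediately below and above."""
    grades = sorted(by_grade)
    smoothed = {}
    for i, grade in enumerate(grades):
        if 0 < i < len(grades) - 1:
            neighbours = [
                by_grade[grades[i - 1]],
                by_grade[grade],
                by_grade[grades[i + 1]],
            ]
            smoothed[grade] = sum(neighbours) / 3
        else:
            # Assumption: the end grades have only one neighbour and are left as found.
            smoothed[grade] = by_grade[grade]
    return smoothed


generalized = generalize(city_wide)

# The amount of adjustment is reported alongside the generalized scores.
adjustments = {g: generalized[g] - city_wide[g] for g in city_wide}
largest_adjustment = max(abs(a) for a in adjustments.values())

print(generalized)
print(round(largest_adjustment, 2))
```

Keeping the adjustments next to the smoothed values mirrors the statement above that some measure of the amount of adjustment is always reported.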

As  an  illustration  of  this  process  of  consolidation  and 
generalization,  the  different  stages  for  the  scores  in  the 
dictation  spelling  tests  will  be  shown. 

In  Table  XIV  are  given  the  class  and  grade  distribution 
for  all  the  fifth  grade  classes  tested  at  Froebel.  In 
Table  XV  the  class  medians  for  all  classes  taking  this 

1See  last  four  columns  of  Table  XIV  on  page  476. 
test are given in full, and in Figure 9, page 479, the various
school curves are shown in relation to the city wide
grade scores and the generalized scores.1 It will be observed
that the generalized curves make for simplicity of
presentation and ease of understanding and that they
accurately reflect the general trend of the development.

CORRELATION 

One  other  type  of  statistical  method  employed  in  this 
report  needs  discussion  and  explanation,  namely,  corre- 
lation. By  correlation  is  meant  the  degree  of  relation 
which  exists  between  two  abilities.  For  instance,  one 
may well ask the question, "Is good spelling dependent
upon (more precisely, does it occur associated with)
good writing?" One would answer this question with
an unqualified "yes" if the best writer in the class proved
to  be  also  the  best  speller,  if  the  second  best  writer  the 
second  best  speller,  and  so  on,  the  worst  writer  proving 
to  be  the  worst  speller.  Such  correspondence  would  be 
called  perfect  correlation  and  is  practically  never  found. 
The  causes  operating  to  produce  variation  are  too  many 
and  too  varied  to  permit  of  perfect  correspondence,  and 
the  real  problem  is  to  determine  the  precise  degree  of 
relationship  existing  between  the  two  abilities.  For  the 
relationship  may  (theoretically)  vary  from  perfect  cor- 
respondence through  no  correspondence  to  an  inverse 
relationship  in  which  the  best  writer  would  be  the  worst 
speller,  and  so  on  (negative  correlation). 

1See  also  Fig.  14,  page  84. 
Figure 9
Relation Between Actual and Generalized Scores

The scale along the base of the figure indicates grades, that along the
left hand vertical axis the average accuracy of spelling. Each X represents
a score made by a class in the Froebel school; each square a class
in the Emerson school, each circle a class in the Jefferson school, and
each triangle a class in the Beveridge school. The development curves
for  the  various  schools  are  shown,  also  the  curve  (double  line)  based  upon 
the  generalized  score. 

The  graph  shows  that  spelling  ability  at  Gary  should  be  represented 
by  a  zone  about  20  per  cent.  wide.  That  is,  in  any  one  grade  the  dif- 
ference between  the  best  and  worst  of  the  school  curves  is  approximately 
20  per  cent.  The  curve  based  on  the  generalized  scores  runs  through 
the  centre  of  the  zone. 

For  other  illustrations  of  relation  between  actual  and  generalized 
curves,  see  Figure  2,  page  19;  Figure  14,  page  84;  Figure  52,  page  295. 

In this report correlation is used only to bring out certain
general relations between tests. No extended or
careful studies are attempted. Consequently, the method
of  correlation  used  and  the  precision  of  the  results  are 
of  little  moment;  it  is  the  general  degree  of  correlation 
only  that  matters.  Accordingly,  a  special  method  of 
correlation  has  been  employed,  a  method  which  lends 
itself  to  simple  definition  and  to  graphic  presentation. 
The  details  of  this  method  will  now  be  explained. 

For  the  purposes  of  this  report,  correlation  may  be 
defined  as  the  extent  to  which  the  members  of  any  group 
maintain  the  same  relative  position  within  the  range  of 
scores  for  one  test  that  they  hold  in  the  scores  for  a 
second  test,  the  extent,  that  is,  to  which  the  children 
that  are  high,  average,  or  low  in  one  test  are  high,  average, 
or  low  in  the  other  also. 

The  degree  of  correlation  is  measured  by  a  coefficient 
of  correspondence.    That  is, 

(1)  The  position  of  each  child  in  the  group  in  both  tests 
is  expressed  in  common  terms,  and  (2)  from  a  comparison 
of  these  common  measures  of  position,  (3)  the  percentage 
of  children  who  maintain  the  same  relative  positions  in 
the  two  distributions  is  easily  determined  by  counting 
and  computation.  This  percentage  will  be  taken  as  the 
coefficient  of  correspondence. 

Measurement  of  relative  position  is  made  in  terms  of 
units  of  variability.1  A  score  which  is  either  higher  or 
lower  than  the  median  by  an  amount  equal  to  the  median 


1See Thorndike, Mental and Social Measurements, page 158, and
Comparable  Measures,  Kelly,  Journal  of  Educational  Psychology, 
December,  1914. 
deviation is considered at a unit distance from the median.
That is, the steps in finding such measures of relative
positions are (1) finding the deviation of each score from
the median, denoting differences derived from scores
larger than the median by plus signs and from scores
smaller than the median by minus signs, (2) finding the
median of these deviations, and (3) dividing each deviation
by the median deviation, carrying the results to
tenths. The resulting ratios are the measures of relative
position desired.
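
The three steps may be sketched in Python as follows. This is an illustration under stated assumptions rather than the report's own procedure: the function name `variability_ratios` and the sample scores are invented, and the ordinary median from the `statistics` module stands in for the interpolated median described under Median Deviation.

```python
from statistics import median


def variability_ratios(scores):
    """Express each score as a signed number of median deviations from the median."""
    mid = median(scores)
    # Step 1: signed deviation of each score from the median.
    deviations = [score - mid for score in scores]
    # Step 2: median of the deviations, taken without regard to sign.
    median_deviation = median(abs(d) for d in deviations)
    if median_deviation == 0:
        raise ValueError("median deviation is zero; ratios are undefined")
    # Step 3: divide each deviation by the median deviation, carried to tenths.
    return [round(d / median_deviation, 1) for d in deviations]


# Hypothetical scores of five children in one test (invented for illustration).
ratios = variability_ratios([90, 100, 105, 110, 120])
print(ratios)
```

For these invented scores the median is 105 and the median deviation is 5, so the children stand at -3.0, -1.0, 0.0, 1.0 and 3.0 units from the median.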

The method of transmuting scores in different tests
into comparable measures of relative position rests
upon the assumption that the range of scores in one test
necessary to include the half of the cases which vary
least from the median is equivalent to the range of scores
in the other test necessary to include the corresponding
cases. As scores in different tests are not directly comparable,
some such assumption must be made. For
educational work the median deviation seems to the
author to be the most suitable measure of variability to use.

The final step in determining the coefficient of correspondence
is the comparison of the variability ratios.
If child A is 2.1 units above the median in one test and
2.0 units above the median in the other, it is apparent
at once that he has maintained the same position in the
two tests within a tenth of a unit. On the other hand,
if child B is +2.1 in one test and -1.7 in the other test,
it is equally clear that he has changed his position in
the group 3.8 units. The number of children out of the
entire group who maintain the same positions within
any  desired  degree  of  exactness  is  thus  easily  determined, 
and  the  per  cent,  that  this  number  is  of  the  total  number 
in  the  group  constitutes  the  coefficient  of  correspondence. 
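
The counting described above may be illustrated with a short Python sketch. The function names and the four pairs of ratios are hypothetical; the first two pairs are children A and B from the text, and the tolerance argument stands for "any desired degree of exactness."

```python
def position_change(ratio_a, ratio_b):
    """Change of position between two tests, in units of variability, to tenths."""
    return round(abs(ratio_a - ratio_b), 1)


def coefficient_of_correspondence(ratios_a, ratios_b, tolerance):
    """Per cent of children whose position changes by no more than `tolerance`."""
    if len(ratios_a) != len(ratios_b) or not ratios_a:
        raise ValueError("need two equal-length, non-empty lists of ratios")
    same = sum(
        1 for a, b in zip(ratios_a, ratios_b)
        if position_change(a, b) <= tolerance
    )
    return 100.0 * same / len(ratios_a)


# Children A and B from the text, plus two hypothetical children.
test_one = [2.1, 2.1, -0.5, 0.0]
test_two = [2.0, -1.7, 0.4, -1.0]

print(position_change(2.1, -1.7))   # child B
print(coefficient_of_correspondence(test_one, test_two, tolerance=1.0))
```

Rounding each difference to tenths before comparing keeps the count in step with ratios that were themselves carried only to tenths.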

The  size  of  the  coefficient  will  depend  not  only  upon 
the  real  degree  of  correspondence  between  the  two 
abilities  tested  but  also  upon  the  reliability  of  each  test. 
For  if  the  conditions  of  the  testing  are  such  that  large 
chance  variations  may  occur,  exact  correspondence 
could  not  be  expected,  even  though  the  relationships 
between  the  two  abilities  were  perfect.  In  the  previous 
chapter,  the  chances  that  variation  in  score  will  occur 
were  shown  to  be  very  great,  and  in  general  it  is  found 
that  even  for  the  most  perfect  tests,  closer  range  of 
correspondence  than  one  unit  is  not  to  be  expected. 
In  this  report,  the  coefficients  of  correspondence  are,  in 
general,  taken  arbitrarily  as  the  percentages  of  the 
groups  which  maintain  their  relative  positions  in  the  two 
distributions  within  one  unit  of  variability. 

Empirical  experimentation  proves  that  in  certain 
cases  the  coefficient  of  correspondence  agrees  quite  closely 
with  the  Pearson  product-moment  coefficient  of  correla- 
tion,1 but  that  in  other  cases  there  is  little  resemblance. 
In  this  report,  however,  the  Pearson  coefficient  (based  on 
the  median  as  the  measure  of  central  tendency)  is  given  in 
all situations in which it might seem to be at all significant.

1 For a normal distribution it is probable that a coefficient based on the
number  of  cases  which  maintain  the  same  position  within  three  quarters 
of  a  unit  of  variability  would  correspond  more  closely  with  the  Pearson 
product-moment  coefficient  based  on  the  same  scores. 
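
As a rough sketch only, a product-moment coefficient "based on the median" may be read as the usual Pearson formula with deviations taken from the medians instead of the means. That reading is an assumption, not something the report spells out, and the function name and sample data below are hypothetical.

```python
from math import sqrt
from statistics import median


def pearson_about_median(xs, ys):
    """Product-moment coefficient using deviations from the medians (assumed reading)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mx, my = median(xs), median(ys)
    dx = [x - mx for x in xs]
    dy = [y - my for y in ys]
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    sum_xx = sum(a * a for a in dx)
    sum_yy = sum(b * b for b in dy)
    if sum_xx == 0 or sum_yy == 0:
        raise ValueError("one of the tests shows no variation about its median")
    return sum_xy / sqrt(sum_xx * sum_yy)


# Hypothetical scores in two tests that rise together exactly.
print(pearson_about_median([1, 2, 3, 4, 5], [3, 5, 7, 9, 11]))
```

When the median and the mean of a test coincide, this agrees with the ordinary Pearson coefficient.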
This appendix contains directions for giving the various
tests,  and  illustrations  of  actual  papers  written  by 
children,  of  answers  and  score  cards,  and  of  tabulation 
sheets. 

I. Handwriting

II. Spelling

III. Arithmetic

IV. Composition

V. Reading


I.    HANDWRITING 

FREE  CHOICE  TEST 

Instructions. — On  the  day  before  the  test,  give  to  each 
teacher  a  copy  of  the  test  paper.  Tell  her  that  her  class 
will  be  tested  in  handwriting  the  next  day,  and  ask  her 
to  have  the  class  practice  writing  the  test  paragraph  so 
that  the  children  may  become  familiar  with  its  contents. 

On  entering  a  room,  say:  "We  are  going  to  have  a 
handwriting  test  to-day.  Please  take  out  your  pens  and 
blotters  and  get  ready  to  write.  I  will  give  each  of 
you  one  of  these  papers  (holding  the  paper  before  the 
class)  but  do  not  write  anything  until  I  ask  you  to." 

When  all  are  supplied,  say:  "Fill  out  the  blanks  at 
the  top  of  the  paper;  your  name,  boy  or  girl,  age,  grade, 
and class. Now lay your pens down and read the instructions
out loud with me."

Read  the  instructions  with  the  class,  making  sure 
that  all  take  part.  Then  say:  "The  instructions  mean 
that  when  I  say  'Start'  you  are  to  copy  as  much  of  the 
paragraph  (pointing)  as  you  can  in  the  time  allowed. 
You  will  be  marked  both  for  how  much  you  write  and 
how well you write. Do not begin until I say 'Start,'
and stop as soon as I give the signal."

When all are ready, say: "Get Ready. Hands up.
Start." Allow exactly two minutes. Then say "Stop."
Have the papers exchanged.

Pass  the  answer  cards,  have  the  first  two  lines  filled 
out,  then  have  the  cards  exchanged  to  correspond  with 
the  papers.  Show  the  children  how  to  count  the  number 
of  letters  written  by  means  of  the  answer  cards.  Have 
the  score  written  on  the  card  in  the  space  provided. 
Collect  cards  and  papers. 
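
The answer card lets a child read off the number of letters written from the small figure printed after the last word copied. A minimal Python sketch of how such running totals can be produced is given here; the function names are hypothetical, and the reading of "Rate" as letters per minute over the two-minute allowance is an assumption.

```python
def cumulative_letter_counts(text):
    """Pair each word with the total number of letters up to the end of that word."""
    total = 0
    counts = []
    for word in text.split():
        # Count letters only, so punctuation attached to a word is ignored.
        total += sum(1 for ch in word if ch.isalpha())
        counts.append((word, total))
    return counts


def letters_per_minute(letters_copied, minutes=2):
    """Assumed meaning of the card's 'Rate' entry: letters copied per minute."""
    return letters_copied / minutes


opening = "Fourscore and seven years ago our fathers brought forth"
for word, total in cumulative_letter_counts(opening):
    print(word, total)
```

A child who stopped after "forth" would thus record 47 letters.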

DICTATION  TEST:  HANDWRITING  AND  SPELLING 

Instructions. — On  entering  a  room,  say:  "We  are  to 
have  a  dictation  spelling  test  to-day.  Take  out  your 
pens  and  blotters,  and  get  ready  to  write.  I  will  give 
each  of  you  a  sheet  of  this  paper.  Please  do  not  begin 
to  write  until  I  tell  you  to." 

Distribute  the  test  papers.  When  all  are  supplied 
have  the  blanks  at  the  top  of  the  paper  filled  out. 

Then  say:  "I  am  going  to  dictate  ten  sentences. 
Lay  your  pens  down  while  I  read  them  to  you,  so 
you  will  understand  them  easily  when  I  ask  you  to 
write." 

Read  the  sentences  at  the  rate  of  about  ten  letters 
per  second  (slowly).  Then  say:  "Now  take  your  pens 
and  get  ready  to  write.  I  shall  not  repeat  at  all,  so  listen 
carefully.  If  you  have  not  finished  a  sentence  when  I 
start  dictating  the  next,  stop  writing,  omit  the  rest  of 
the  sentence,  and  listen,  but  try  to  write  fast  enough  to 
Answer and Record Card1

[Facsimile of a filled-in card for Test 9. Printed fields: total number of letters copied in two minutes; Quality, with columns headed 9, 7, and 8; Rate; Median Quality; Scored by. The handwritten entries are not legible in this copy.]

1 Under quality, the score under nine is the quality score for the Free
Choice Test, under seven, for the handwriting in the Dictation Tests, under
eight, for the handwriting in the Composition Test.

(The small figures at the end of any word show the total number of
letters up to the end of that word.)

Fourscore 9 and 12 seven 17 years 22 ago 25 our 28 fathers 35 brought 42
forth 47 upon 51 this 55 continent 64 a 65 new 68 nation 74 conceived 83 in 85
liberty 92 and 95 dedicated 104 to 106 the 109 proposition 120 that 124 all 127
men 130 are 133 created 140 equal. 145 Now 148 we 150 are 153 engaged 160
in 162 a 163 great 168 civil 173 war 176 testing 183 whether 190 that 194 nation 200
or 202 any 205 nation 211 so 213 conceived 222 and 225 so 227 dedicated 236 can 239
long 243 endure. 249 We 251 are 254 met 257 on 259 a 260 great 265 battlefield 276
of 278 that 282 war. 285
keep up. Now give me your best writing and spelling."

When  the  second  hand  of  the  watch  reaches  the  sixty 
second  mark,  read  all  of  the  first  sentence  and  tell  the 
children  to  "write  it."  Wait  until  the  second  hand 
reaches  the  position  shown  by  the  figures  in  a  parenthesis 
just  before  the  second  sentence,  then  read  that  sentence, 
and so on through the test, the children writing the sentences
during the intervals between dictation.

At  the  conclusion  of  the  test  pass  the  answer  cards 
and have the blanks in the first two lines filled out.
Collect  cards  and  papers. 

In  the  fourth,  sixth,  and  eighth  grades  give  both  the 
easy  and  the  difficult  words. 
II. SPELLING

LIST  TEST 

Instructions. — On  entering  a  room,  give  the  teacher  an 
answer  card  for  the  test  and  tell  her  that  as  soon  as  the 
children  are  ready,  you  are  going  to  ask  her  to  dictate 
the  words.  Give  her  no  further  instructions.  If  she 
asks  any  questions  about  the  giving  of  the  test,  tell  her  to 
follow  her  usual  practice. 

Then say to the children: "We are going to have a spelling
test to-day. I am going to ask your teacher to dictate
twenty words to you, and you will write them on
these slips of paper."

Distribute  the  test  papers  and  have  the  blanks  filled 
out.    Next  ask  the  teacher  to  dictate  the  test  words. 

When  she  has  finished,  have  the  papers  exchanged 
twice;  then  distribute  the  answer  cards  for  the  test  and 
have  the  blanks  filled  out.  Have  the  cards  exchanged  to 
correspond  with  the  papers,  and  in  the  lower  grades 
collect  the  cards  and  papers  together.  In  the  upper 
grades  the  papers  are  to  be  corrected  by  the  children; 
each  word  being  marked  (x)  or  (c),  (for  wrong  or  right), 
the  marks  to  be  made  on  the  cards,  and  not  on  the  papers. 

Collect  the  cards  and  papers  together. 
III. ARITHMETIC

SERIES  B  (COURTIS) 

Instructions. — On entering a room, ask the teacher for
permission to give a test and make sure that the children
are provided with pencils.

Say: "We are going to have arithmetic tests this week.
To-day we are to have addition and subtraction; to-morrow
multiplication and division. Please do not turn
these papers over (showing the two sides of the test) until
we are ready."

Then distribute the papers. Have the blanks filled in.
Read the instructions out loud before giving the warning
signal: "Get Ready. Hands up"; then when the bell
rings, "Start." Observe the time intervals given in the
instructions and have the timing checked by the teacher.

Give  the  stopping  signal  by  bell,  but  say  also:  "Stop. 
Hands  up.    Make  a  cross  by  the  last  example  finished." 

Distribute  the  score  cards.  Have  the  blanks  filled  in, 
writing  the  number  of  the  class  after  "Grade."  Have 
the  papers  and  cards  exchanged  twice.  Read  the 
answers  from  the  card,  letting  the  children  check  the 
examples right (c) or wrong (x) on the score card in the
column marked "Check." Part of an answer does not
count. Find the total number of the examples tried and
the  total  right,  writing  these  scores  on  the  face  of  the  card 
also. 

Collect  the  cards  by  rows. 

Have  the  papers  exchanged  again.  Distribute  new 
score  cards,  and  again  have  the  papers  scored. 

Collect  both  cards  and  papers  by  rows. 

Pass  the  papers  for  the  subtraction  test.  Have  the 
blanks  filled  in  and  read  the  instructions  out  loud.  Give 
the  test  as  before. 

Pass  the  score  cards  and  have  the  blanks  filled  in,  then 
collect  both  cards  and  papers. 

SERIES B (CLEVELAND)

Instructions. — On entering a room say to the children:
"We are to have an arithmetic test to-day. These
papers contain five short tests. The first set of examples
(point) is called set C; second, set H; the others, sets G,
O, and L. Please do not look at any of these examples
until we are ready. As soon as you get a copy of the
tests, fill out blanks on cover."

Distribute the papers.

Then say: "When I say 'Get ready,' raise your pencil hand,
take hold of cover with the other so you can open test
quickly when I say 'Start.' When I say 'Stop,'
close your papers and stand beside your desk. The
first test is a test in the multiplication tables. It is called
set C and comes at the top of the page. Write the answers
on the paper directly underneath the examples.
Now lay your pencils down while we practice starting:
'Get ready, start, stop, close papers, stand.'"

When all understand, give the first test, warning the
children to multiply just before giving the signal: "Get
ready." Have children put a circle around the last problem
completed when they stop.

While the children are standing, say: "The next set
of examples, Set H, on the lower half of this page, contains
easy examples in the addition and subtraction of
fractions. Watch the signs and do what they tell you to
do. If you cannot work these examples, close your
papers and wait for the next test."

Give  the  test. 

For the next test, say: "Set G is another multiplication
test. Write your answers directly underneath the
examples."

Give the test.

"The next test, Set O, on lower half of cover page, is
made up of examples in addition, subtraction, multiplication,
and division of fractions. Watch the signs and do
what they tell you to do."

Give the test.

"Set L, the last test, is on the back of the paper, and is a
test in long multiplication."

Give  the  test. 

Distribute  cards,  one  set  at  a  time,  and  have  blanks 
filled  out.  Then  place  all  cards  inside  folder.  Collect  in 
grades  three  and  four.  In  other  grades  exchange  twice 
and  score  in  this  order:  L,  G,  and  C  as  far  as  time  permits. 
IV. COMPOSITION

Instructions. — Give list of subjects to teacher to write
on board.

Say  to  the  children:  "I  want  to  find  out  to-day  what 
kind  of  a  composition  you  can  write.  I  am  going  to  ask 
you  to  write  a  story  about  an  interesting  experience 
that  you  or  your  friend  may  have  had  sometime  or  other. 
These  are  to  be  your  own  stories;  nothing  that  you  have 
just  read  somewhere  or  seen  at  a  moving  picture  theatre. 
A  real  story  will  probably  be  best,  but  if  you  cannot  write 
a real story, you may make up one of your own. Make it
as  interesting  or  as  exciting  as  you  can.  Your  teacher 
has  written  some  suggestions  on  the  board  to  help  if 
you  cannot  think  of  anything  yourself."    Read  them. 

An  Interesting  Experience. 

A  Storm. 

An  Accident. 

A  Runaway. 

An  Errand  at  Night. 

An  Unexpected  Meeting. 

On  the  Ice. 

In  the  Woods. 

On  the  Water. 

In  the  Mountains. 
"However, you do not have to use any of these subjects
unless you want to. Will those in the front seats please
distribute the papers for me?"

Fill  in  the  blanks. 

"I  am  going  to  ask  you  to  start  together.  You  may 
spend  the  first  few  minutes  thinking  of  what  you  are 
going  to  say,  if  you  wish.  I  shall  give  you  about  twenty 
minutes  in  which  to  write  your  story.  If  you  need  more 
paper  than  the  sheet  which  you  have,  raise  your  hand 
and  I  will  bring  you  another.  If  you  should  finish  your 
story  before  the  time  is  up,  please  let  me  know  by  raising 
your  hand." 

Start — allow  twenty  minutes — stop. 

Have  children  count  words  written. 
phrase, only one error — the last — should be counted.
(For  instance,  a  mispronunciation  and  a  repetition.) 

A  single  word  repeated  once  is  not  counted  a  repetition, 
but  if  it  is  repeated  more  than  once  it  is  a  repetition. 

Watch  for  the  substitution,  insertion,  or  omission  of 
letters  as  well  as  of  words  and  syllables. 

Keep time to the nearest second only.

Place each pupil under each of the following categories:
Expression: No expression (N. E.), mechanical
(M. E.), intelligent (I. E.). Type of reading: Word (W),
sentence (S), paragraph (P).

REPRODUCTION  TEST 

Instructions. — The Reading and Reproduction Tests
are to be given as follows: "Little Baby Bear" in grades
two, three, four, and five; "Nothing on the Breakfast
Table," in grades five, six, seven, and eight; "An Accident"
in grades eight, nine, ten, eleven, and twelve;
"Two Ways of Asking a Favor" in grades four, five, six,
seven, and eight.

In  second  grade  classes  no  attempt  need  be  made  to 
have  the  children  reproduce  the  story,  if  in  the  judgment 
of the teacher they lack ability to write connected
sentences.

Time  allowance  for  "Little  Baby  Bear"  is  one  minute 
for  second  and  third  grades,  thirty  seconds  for  fourth  and 
fifth  grades;  for  "Nothing  on  the  Breakfast  Table"  one 
minute;  for  "An  Accident,"  one  minute;  and  for  "Two 
Ways of Asking a Favor," forty-five seconds.
On entering a room, say: "I have a reading test for
you  to-day.  I  am  going  to  ask  you  to  read  a  story  for 
me,  and  will  give  each  of  you  a  copy  of  these  papers, 
but  you  are  not  to  look  on  the  inside  pages  until  I  tell 
you  to  do  so." 

Have  the  papers  passed  by  the  boys  and  girls  in  the 
front  seats.  When  all  are  supplied,  have  the  blanks  at 
the  top  of  the  sheet  filled  out  and  the  instructions  read 
out  loud. 

Say:  "These  instructions  mean  that  when  I  say 
'Start'  you  are  to  read  out  loud  with  me  the  title  and 
everything  that  is  printed  on  this  front  page  (pointing). 
When I say: 'Turn,' you are to turn the page and read
the  rest  of  the  story  to  yourself.  You  should  read  only 
as  rapidly  as  you  can  understand  what  you  read,  for 
when  you  finish  I  am  going  to  ask  you  to  turn  to  the  last 
page  (illustrating)  and  write  in  your  own  words  as  much 
of the story as you can remember. When I say: 'Stop,'
I  am  also  going  to  say:  'Draw  a  line  around  the  last 
word  read.'  That  means  that  if  you  are  reading  a  word 
in  the  middle  of  a  sentence,  for  instance,  you  are  to  draw 
a  line  around  it  like  this  (illustrating  on  board  or  test 
paper),  to  show  me  just  where  you  stopped.  Now  we 
are  ready  to  try  it." 

Say:  "Start"  and  begin  to  read  the  title  and  story. 
At  the  bottom  of  the  first  page  say:  "Turn"  and  start 
the  stop-watch.  Exactly  the  proper  number  of  seconds 
later  say  "Stop.  Draw  a  line  around  the  last  word 
read;  turn  to  the  last  page  and  begin  to  write  as  much  as 
you can remember of the story you have just read."
Start the stop-watch again; exactly three minutes later
say: "Stop. Put a cross on the paper after the last word
written and go on writing." At the end of five minutes
say: "Stop."

Collect  the  papers. 

Where  a  class  takes  two  or  three  of  the  tests  they  are 
to  be  given  in  succession  in  the  same  class  period. 

KANSAS  SILENT  READING  TEST 

Instructions. — On entering a room, say to the teacher:
"May I test this class this hour?" Upon receiving permission,
make sure the class is the one called for on your
schedule.

Have  the  children  clear  their  desks,  and  make  sure 
each  is  supplied  with  a  sharp  pencil. 

When  all  are  ready,  say:  "I  am  going  to  give  you  a 
new  test  to-day.  This  has  instructions  on  the  front 
(holding  paper  up  for  inspection),  questions  on  the  inside 
pages  (opening  the  test),  and  on  the  back.  Please  do 
not  open  the  papers  or  look  at  the  questions  until  I 
give  you  the  proper  signal." 

Distribute  the  papers. 

Have  the  children  fill  in  name,  age,  grade — only. 

Read the instructions out loud with the children, having
them actually draw a line around the word "cow"
at the point where the instructions say to draw the line.

For  grades  three  to  eight  the  time  interval  is  to  be  five 
minutes.    For  grades  nine  to  twelve  the  time  interval  is 
The form for the other reproduction test was the same as in Test No. 1.
The contents of the other tests were as follows:


NOTHING ON THE BREAKFAST TABLE

[Facsimile of the printed test story; the text is too badly reproduced in this copy to be restored word for word. So far as it can be made out, Fred comes down to breakfast sleepy and cross and complains that there is nothing on the breakfast table. Small figures then appear on the table and speak in turn, among them Pepper from the East Indies, Salt, Wheat, sugar from the West Indies, and apparently two more from Java and Arabia, each reminding him of what it contributes to his meal. The visitors vanish as Kate brings in his breakfast, and Fred remarks how much there is on the breakfast table after all.]


Silent  Reading 
The  Accident. 

I had not walked far into the country, before I found work in
Brown's brickyard at $22 a month and board. It was my task to cart
clay in the afternoon to fill up the pits, which had been emptied during
the morning. It was an idle enough kind of job. All I had to do
was to walk alongside my horse, a big white beast with no
joints at all except where its legs were hinged to the backbone,
back it up to the pit, and dump the load. But, walking so in
the autumn sun, I fell a-dreaming. I forgot claybank and pit;
I was back in the old town, saw my sweetheart play among the
timber. I met her again on the Long Bridge. I held her hands
once more in that last meeting—the while I was mechanically
backing my load up to the pit and making ready to dump it.
Daydreams are out of place in a brickyard. I forgot to take out
the tail-board. To my amazement I beheld the old horse skating
around, making frantic efforts to keep its grip on the soil, then
slowly rise before my bewildered gaze, clawing feebly at the air as
it went up and over, backward, into the pit, load, cart and all.

I wish for my own reputation that I could truly say I wept for
the poor beast. I am sure I felt for it, but the reproachful look
it gave me as it lay there on its back, its four feet pointing skyward,
was too much. I sat upon the edge of the pit and shouted
with laughter, feeling thoroughly ashamed of my levity. Mr.
Brown himself checked it, running in with his two sons and demanding
to know what I was doing. They had seen the accident
from the office, and at once set about getting the horse out. That
was no easy matter. It was not hurt at all, but it had fallen so
as to bend one of the shafts of the truck like a bow. It had to be
sawed in two to get the horse out. When that was done, the
heavy ash stick, rebounding suddenly, struck one of the boys who
stood by a blow on the head that laid him out senseless beside the
cart.

It was no time for laughter then. We ran for water and restoratives,
and brought him to, white and weak. The horse by that
time had been lifted to his feet and stood trembling in every limb,
ready to drop. It was a sobered driver that climbed out of the
pit at the tail end of the procession which bore young Brown home.
I spent a miserable hour hanging around the door of the house
waiting for news of him. In the end his father came out to comfort
me with the assurance that he would be all right. I was not
even discharged, though I was deposed from the wagon to the command
of a truck of which I was myself the horse. I "ran out" brick
from the pit after that in the morning.




Silent  Reading 

TWO  WAYS  OF  ASKING  A  FAVOR 


The lion-tamer stood leaning against one of his cages. As
the two young officers came up to him, he folded his arms
and watched them idly. Duke Alexander looked him over
carefully before he spoke; but Prince Michael began curtly:

"Are you Romanoff, the lion-tamer?"

Romanoff nodded. "Yes," he said, quietly.

"I am Prince Michael of the Emperor's Guard. The Grand
Duke orders you to come to-morrow to his place."

Romanoff looked at him from under his long lashes. "What
is the Grand Duke to me," he said, "or what am I to him? I
am here to look after my lions."

Prince Michael's face flushed scarlet and the hot blood
mounted to his eyes. He took a step forward, but his cousin
pulled at his arm.

"You are quite in the wrong, Michael," he said. "That is
no way to ask a favor. Go over there while I speak with him."

Michael walked over to the entrance of the tent and waited
sullenly; but the young Duke, looking at Romanoff, said: "Do
not mind him. He is only a great school-boy. It is not the
Grand Duke who wants you. It is my sister, the little Duchess
Vera. She is very lame, and walks on crutches. She suffers a
great deal. She came to the circus to-day, and saw you; and
now she wants you to come and tell her all about the lions.
Set your own price. Whatever your time is worth, I will pay
you. I can refuse her nothing."

A pleasant smile lighted up Romanoff's dark face, showing his
firm white teeth, and the kindly lines in the corners of his mouth.
"What my time is worth is my own affair," he said. "You
could not pay me for going; but if the little lady is lame and ill,
and wants me, I will go gladly."
Answer and Record Card
Silent Reading

[Facsimile of a filled-in card, Test No. 13. Printed fields, so far as legible: Grade; Class; Number of Words and Rate; Number of Words Written and Rate; Quality of Reproduction. The handwritten entries are not legible in this copy.]


to be three minutes. Set the clock at 12:00. Start with
bell and stop with bell. Have teacher time test with
stop-watch. Record the time interval reported by
teacher. For each test, give the warning signal: "Get
ready. Hands up," and the stopping signal: "Stop.
Put a cross in the right hand margin opposite the last
question you have answered. Close your papers."

Distribute  score  cards.  Have  name,  sex,  age,  and 
grade  filled  in. 

For  grades  three,  four,  five,  and  six  have  the  score 
cards  placed  within  the  test  paper.    Collect  the  papers. 

For  grades  seven  to  twelve  distribute  the  score  cards. 
Have  name,  sex,  age,  and  grade  filled  in,  then  have  the 
pupils  exchange  papers  and  cards  (twice). 
Then say: "Here is a large score card like the small
one you have. Watch me score a paper. The correct
answer to the first one is 'yellow' (pointing). It is
yellow on this paper so I will mark this score C; the
second answer, etc.; this answer is wrong so I will mark
out the score with a cross. This is the last answer on
the paper, so I will mark out all the rest of the scores.
Now I will add all the scores not crossed out and write
the sum here. Score your paper in this same way. If
you don't understand, study the instructions on the card.
When you have found your total score, record it after
the word 'Score' here (pointing)." Collect the cards
by rows.
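
The scoring procedure the examiner demonstrates amounts to adding the values of the exercises answered correctly, since wrong answers and exercises not tried are both crossed out. A minimal Python sketch follows; the function name, the mark symbols, and the exercise values are all hypothetical and are not taken from the actual score card.

```python
def total_score(values, marks):
    """Add the values of all exercises whose score was not crossed out.

    values: the value printed for each exercise in the Score column.
    marks:  "c" for a correct answer, "x" for a wrong one,
            None for an exercise the child did not try.
    """
    if len(values) != len(marks):
        raise ValueError("every exercise needs exactly one mark")
    return sum(value for value, mark in zip(values, marks) if mark == "c")


# Hypothetical card with five exercises; the child stopped after the third.
values = [1.0, 1.5, 2.0, 2.5, 3.0]
marks = ["c", "x", "c", None, None]
print(total_score(values, marks))
```

Having a second child score the same paper on a new card, as the directions require, gives two totals that can be compared to catch errors in marking or addition.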

Have the papers exchanged again. Distribute new
score  cards.  Have  each  child  fill  out  a  card  from  the 
paper  he  now  has,  following  the  same  procedure  as  before. 
Collect  the  cards. 

TRABUE LANGUAGE SCALE

Instructions. — "I have another reading test to-day.
This sheet contains some incomplete sentences, which
form a scale. This scale is to measure how carefully
and rapidly you can think, and especially how good you
are in your language work. You are to write one word
on each blank, in each case selecting the word which
makes the most sensible statement. You may have just
seven minutes in which to sign your name at the top of
the page and write the words that are missing. The
paper will be passed to you face downward. Do not
turn it over until we are all ready. After the signal is
given to start, remember that you are to write just one
word on each blank and that your score depends on the
number of perfect sentences you have at the end of seven
minutes."

Answer and Record Card

[illegible] Reading Test

INSTRUCTIONS

The opposite side of this card gives the correct answers.
Compare each answer on the test paper with the corresponding
answer on the card. When an answer is wrong, draw a line through
the value for that answer in the column headed Score. In the same
way mark out the scores of all exercises not tried. Find the total of
the scores not crossed out and record it at bottom of the column
after the word score above.

Answers to tests for Grades 5, 6, 7, 8

[Score distribution column: a list of class intervals ending
in "... and above." Most of the figures are illegible in this
copy; "between 40 and 44.9" is among the legible entries.]

... "3" in the group marked "between 7 and [illegible]," and
so on until the whole number of scores are recorded. The sum
of these numbers must equal the number of children taking
the test.

The median score is the score on the middle paper in the pile
of papers as arranged according to [size] of scores. If there
are 35 papers, the median score is the score on the 18th paper.
If there are 36 papers, the median score is half way between
the score on the 18th paper and the score on the 19th paper.
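
The record card's two bookkeeping rules, tallying scores into class intervals and taking the score on the middle paper as the median, can be sketched as follows. The function names and the interval bounds in the example are assumptions made for illustration; the card's own intervals are not fully legible and are not reproduced here.

```python
from bisect import bisect_right
from typing import List, Sequence


def median_score(scores: Sequence[float]) -> float:
    """Score on the middle paper once the papers are arranged by score.

    With an odd number of papers this is the middle one (the 18th of 35);
    with an even number it is half way between the two middle papers
    (the 18th and 19th of 36).
    """
    if not scores:
        raise ValueError("no papers to score")
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def tally(scores: Sequence[float], lower_bounds: Sequence[float]) -> List[int]:
    """Count the scores falling in each class interval.

    lower_bounds: ascending lower limits of the intervals (hypothetical);
                  the last interval is open-ended ("... and above").
    """
    counts = [0] * len(lower_bounds)
    for score in scores:
        index = bisect_right(lower_bounds, score) - 1
        if index < 0:
            raise ValueError(f"score {score} is below the lowest interval")
        counts[index] += 1
    return counts


# Hypothetical class scores and interval bounds, for illustration only.
class_scores = [3, 12, 25, 11]
counts = tally(class_scores, [0, 10, 20])
# The card's check: the counts must add up to the number of children tested.
assert sum(counts) == len(class_scores)
```

The final assertion mirrors the card's own check that the recorded numbers equal the number of children taking the test.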

The papers are then distributed.

"After you have been working seven minutes, I shall
say: 'Stop.' You will all please stop at once. Now if
you are all ready, when the bell rings you may turn your
papers over, sign your names and fill the blanks."

Five seconds before the signal from the automatic
timer, the warning: "Get ready" was given, and when
the bell rang, the command "Start" was also given.
Seven minutes later, when the bell rang the second time,
the command "Stop" was given, and then the instructions:
"Mark a cross just before the number of the
sentence at which you stopped. Now write 'boy' or
'girl' after your name at the top of the sheet. Then
just below your name, write your age, your grade, and
the hour. Turn your papers face down. The children
at the back of the room please collect the papers for me."

¹ This sheet was designed for the test in copying figures, but was adapted
to the needs of this test as above.